Abstract

We study the problem of reconstructing a multivariate trigonometric polynomial having
only few non-zero coefficients from few random samples. Inspired by recent work of
Candes, Romberg and Tao we propose to recover the polynomial by Basis Pursuit, i.e.,
by $\ell^1$-minimization. Numerical experiments show that in many cases the trigonometric
polynomial can be recovered exactly provided the number $N$ of samples is high enough
compared to the “sparsity”, i.e., the number of non-vanishing coefficients. However, $N$ can be
chosen small compared to the assumed maximal degree of the trigonometric polynomial.
Hence, the proposed scheme may overcome the Nyquist rate. We present two theorems that
explain this observation. Unexpectedly, they establish a connection to an interesting
combinatorial problem concerning set partitions, which seemingly has not been considered
before.

1 Introduction

Recently, Candes, Romberg and Tao observed the surprising fact that it is possible to recover
certain discrete functions exactly from vastly incomplete information on their discrete Fourier
transform [?]. The crucial property of these functions is their sparsity with respect to the
canonical (Dirac) basis, i.e., their (unknown) support is very small. The recovery procedure
consists in minimizing the $\ell^1$-norm of the signal subject to the constraint that the Fourier
coefficients are matched. This task is also known as Basis Pursuit [?]. Since minimizing
the total variation norm can be reformulated as minimizing the $\ell^1$-norm there are relevant
applications in image processing, in particular, computer tomography [?].

This paper is concerned with the related problem of reconstructing a sparse trigonometric
polynomial from few randomly chosen samples drawn from the continuous uniform distribution
on $[0,2\pi]^d$. By “sparse” we mean that only very few coefficients of the polynomial are non-zero.
However, a priori we do not know the support of the coefficients. From a practical viewpoint,
considering such polynomials can be motivated as follows. First, trigonometric polynomials
with a certain maximal degree model band-limited signals. Secondly, in many cases it seems
reasonable that only few coefficients (with unknown location) are large. Such a signal can at
least be approximated by a sparse one.

We propose to reconstruct the sparse polynomial from its random samples by Basis Pursuit
similarly as in [?]. From numerical experiments it is evident that this scheme can indeed
reconstruct the polynomial exactly provided the number of samples is large enough with respect
to the sparsity. When comparing the number of samples to the assumed maximal degree of
the polynomial it turns out that this method may overcome the Nyquist rate by far. Thus,
the described recovery method is very likely to have high potential for practical applications
in signal processing.

We will present two theorems that explain the observed phenomenon. Similar to [?] the first
one estimates the probability of exact reconstruction given an arbitrary sparse trigonometric
polynomial. Hence, this can be viewed as a worst case estimate. Our second theorem is
more directed towards an average case analysis. It gives a probability estimate for generic
polynomials in the sense that the support of the coefficients is modelled as a random set. A
result of this type seems to be new. As one may expect it gives better probability estimates
than the first one. Unexpectedly, however, it relates the problem to a seemingly new and difficult
combinatorial problem about set partitions. Unfortunately, we were not able to solve this
problem in general, and as a consequence we cannot yet fully exploit our probability estimate.
We have to leave the combinatorial aspect as an interesting open problem.

We would like to mention some recent related work. In [?] Candes et al. study stability
aspects of the problem and also investigate recovery from few inner products with random
vectors following Gaussian and binary distributions. In [?] some practical examples
are presented. The recovery from Gaussian measurements via Basis Pursuit is also
investigated by Rudelson and Vershynin in [?] in the context of error correcting codes, while
Tropp [?] studies the reconstruction by Orthogonal Matching Pursuit. In [?] Donoho
and Tsaig introduce the terminology “compressed sensing” for this range of problems and in
[?] probabilistic results concerning Basis Pursuit are discussed. A randomized sublinear
algorithm for reconstructing sparse Fourier data is introduced and analyzed in [?]. If the reader
is interested in reconstructing not necessarily sparse trigonometric polynomials from random
samples we refer to recent work of Bass and Gröchenig [1], where probabilistic estimates of
related condition numbers are developed.

The paper is structured as follows. In Section 2 we describe the problem and present our
main results. To this end we also need to introduce some background on set partitions. Section
3 will be devoted to the proofs. Section 4 gives some more information on the combinatorial
problem related to our second theorem. In Section 5 we present some plots of the probability
bounds resulting from our theorems and finally Section 6 describes some numerical experiments.

Acknowledgements: The author was supported by the European Union’s Human Potential
Programme, under contract HPRN-CT-2002-00285 (HASSIP). He would like to thank
Stefan Kunis for stimulating discussions on numerical aspects of the topic. He also
acknowledges interesting conversations with Justin Romberg and his mail correspondence with
Emmanuel Candes on the subject.

2 Description of the Main Results

2.1 The Setting 

Let $\Pi_q = \Pi_q^d$ denote the space of all trigonometric polynomials of maximal order $q \in \mathbb{N}_0$ in
dimension $d$. An element $f$ of $\Pi_q$ is of the form

$$f(x) = \sum_{k \in [-q,q]^d \cap \mathbb{Z}^d} c_k e^{ik\cdot x}, \qquad x \in [0,2\pi]^d,$$

with some Fourier coefficients $c_k \in \mathbb{C}$. The dimension of $\Pi_q^d$ will be denoted by $D := (2q+1)^d$.
In the sequel we will use the short notation $[-q,q]^d$ instead of $[-q,q]^d \cap \mathbb{Z}^d$.

Throughout the rest of this paper we will be dealing with “sparse” trigonometric polynomials,
i.e., we assume that the sequence of coefficients $c_k$ is supported only on a set $T$, which is
much smaller than the dimension $D$ of $\Pi_q$. However, a priori nothing is known about $T$ apart
from a maximum size. Thus, it is useful to introduce the set $\Pi_q(M) = \Pi_q^d(M) \subset \Pi_q$ of all
trigonometric polynomials whose Fourier coefficients are supported on a set $T \subset [-q,q]^d \cap \mathbb{Z}^d$
satisfying $|T| \le M$, i.e., $f \in \Pi_q(M)$ is of the form $f(x) = \sum_{k \in T} c_k e^{ik\cdot x}$. Note that $\Pi_q(M)$ is
not a linear space.

Our aim is to sample a trigonometric polynomial $f$ of $\Pi_q(M)$ at $N$ randomly chosen points
and try to reconstruct $f$ from these samples. We model the sampling points $x_1,\dots,x_N$ as
independent random variables having the uniform distribution on $[0,2\pi]^d$. We collect them
into the sampling set

$$X := \{x_1,\dots,x_N\}.$$

Obviously, the cardinality $|X|$ equals the number of samples $N$ with probability 1.
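
The sampling model just described can be illustrated by a minimal sketch in dimension $d = 1$. The function names and the Gaussian choice of the non-zero coefficients are assumptions made only for this illustration; the text prescribes neither.

```python
import cmath
import math
import random


def random_sparse_polynomial(q, M, rng):
    """Return {k: c_k} with M distinct frequencies k in [-q, q] (d = 1).

    The complex Gaussian coefficients are an arbitrary illustrative choice.
    """
    support = rng.sample(range(-q, q + 1), M)
    return {k: complex(rng.gauss(0, 1), rng.gauss(0, 1)) for k in support}


def evaluate(coeffs, x):
    """Evaluate f(x) = sum_k c_k exp(i k x)."""
    return sum(c * cmath.exp(1j * k * x) for k, c in coeffs.items())


def draw_samples(coeffs, N, rng):
    """Draw N i.i.d. uniform points on [0, 2*pi] and return (points, values)."""
    points = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(N)]
    return points, [evaluate(coeffs, x) for x in points]


rng = random.Random(0)
coeffs = random_sparse_polynomial(q=40, M=8, rng=rng)
points, values = draw_samples(coeffs, N=25, rng=rng)
```

The parameters $q = 40$, $M = 8$, $N = 25$ mirror the example of Figure 1; only the pairs (points, values) would be handed to a reconstruction method, not the support of `coeffs`.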

Motivated by results of Candes, Romberg and Tao in [?] we propose the following non-linear
method of reconstructing $f \in \Pi_q^d(M)$ from its sampled values $f(x_1),\dots,f(x_N)$. We minimize
the $\ell^1$-norm of the Fourier coefficients $c_k$,

$$\|(c_k)\|_1 := \sum_{k \in [-q,q]^d} |c_k|,$$

under the constraint that the corresponding trigonometric polynomial matches $f$ on the
sampling points. That is, we solve the problem

$$\min \|(c_k)\|_1 \quad\text{subject to}\quad g(x_j) := \sum_{k \in [-q,q]^d} c_k e^{ik\cdot x_j} = f(x_j), \quad j = 1,\dots,N. \tag{2.1}$$

This task, also referred to as Basis Pursuit [?], can be performed with efficient convex
optimization techniques [3], or even linear programming in case of real-valued coefficients $c_k$.

Once all the coefficients $c_k$, $k \in [-q,q]^d$, are known also $f$ is known completely and can
be evaluated efficiently at any point, e.g., with the Fourier transform for non-equispaced data
developed by Daniel Potts et al. [?].
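
For real-valued coefficients, the reduction to linear programming mentioned above can be sketched as follows. This is the standard splitting into positive and negative parts, written under the assumption that all $c_k$ are real; the auxiliary variables $u_k, v_k$ are introduced here for illustration only.

$$
\begin{aligned}
\min_{u,v}\ & \sum_{k\in[-q,q]^d} (u_k + v_k) \\
\text{subject to}\ & \sum_{k\in[-q,q]^d} (u_k - v_k)\cos(k\cdot x_j) = \mathrm{Re}\, f(x_j), \\
& \sum_{k\in[-q,q]^d} (u_k - v_k)\sin(k\cdot x_j) = \mathrm{Im}\, f(x_j), \qquad j = 1,\dots,N, \\
& u_k \ge 0,\quad v_k \ge 0 .
\end{aligned}
$$

At an optimum at most one of $u_k, v_k$ is non-zero for each $k$, so that $c_k = u_k - v_k$ and $|c_k| = u_k + v_k$.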

Surprisingly, there is numerical evidence that the above reconstruction scheme recovers $f$
exactly provided the number of samples is large enough compared to the sparsity. Indeed,
Figure 1 shows a sparse trigonometric polynomial with 8 non-zero coefficients and $N = 25$
sampling points while the maximal degree is $q = 40$, i.e., $D = 81$. The right hand side shows the
reconstruction from the samples by solving the minimization problem (2.1). The reconstruction
is exact! We refer to Section 6 for more information on the numerical experiments.

Figure 1: Original sparse trigonometric polynomial with samples (left) and reconstruction
(right)

Our main results are two theorems that give a theoretical explanation of this phenomenon.
The first one treats any sparse polynomial in $\Pi_q^d(M)$ and the second one considers “generic”
polynomials in the sense that the set $T$ of non-vanishing coefficients is modelled as a random
set. Unexpectedly, both results involve combinatorial quantities connected to set partitions. We
will spend the next section introducing the necessary notation.

2.2 Set Partitions 

We denote $[n] := \{1,2,\dots,n\}$. A partition of $[n]$ is a set of subsets of $[n]$ - called blocks - such
that each $j \in [n]$ is contained in precisely one of the subsets. By $P(n,k)$ we denote the set of
all partitions of $[n]$ into exactly $k$ blocks such that each block contains at least 2 elements. For
example $P(4,2)$ consists of

{{1,2},{3,4}}, {{1,3},{2,4}} and {{1,4},{2,3}}.

Clearly, $P(n,k)$ is empty if $k > n/2$. The numbers $S_2(n,k) = |P(n,k)|$ are called associated
Stirling numbers of the second kind. They have the following exponential generating function,
see [?, formula (27), p.77],

$$\sum_{n=1}^{\infty} \sum_{k=1}^{\lfloor n/2 \rfloor} S_2(n,k)\, y^k \frac{x^n}{n!} = \exp\big(y(e^x - x - 1)\big). \tag{2.2}$$

Based on this one may deduce that the numbers $S_2(n,k)$ satisfy the recursion formula

$$S_2(n,k) = k\,S_2(n-1,k) + (n-1)\,S_2(n-2,k-1). \tag{2.3}$$

A combinatorial argument for this recursion also exists, see Section 4 where further
information on the numbers $S_2(n,k)$ will be given.
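
As a small sanity check, the recursion (2.3) can be compared with a brute-force enumeration of $P(n,k)$. The following is a minimal sketch; the function names and the base cases $S_2(0,0) = 1$, $S_2(1,k) = 0$ are choices made here for illustration.

```python
from functools import lru_cache


def set_partitions(elements):
    """Yield every partition of the list `elements` as a list of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        # put `first` into one of the existing blocks ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or into a new block of its own
        yield [[first]] + partition


def P(n, k):
    """Partitions of [n] into exactly k blocks, each with at least 2 elements."""
    return [p for p in set_partitions(list(range(1, n + 1)))
            if len(p) == k and all(len(block) >= 2 for block in p)]


@lru_cache(maxsize=None)
def S2(n, k):
    """Associated Stirling numbers of the second kind via recursion (2.3)."""
    if n == 0 and k == 0:
        return 1
    if n <= 1 or k <= 0:
        return 0
    return k * S2(n - 1, k) + (n - 1) * S2(n - 2, k - 1)
```

For instance, `P(4, 2)` returns the three partitions listed above, and `S2(6, 2)` evaluates to 25 (ten splittings into two triples plus fifteen splittings into a pair and a quadruple).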

We also need partitions of a different type. An adjacency is defined to be an occurrence
of two consecutive integers of $[n]$ in the same block of a partition. Here, consecutive is
understood in the circular sense, i.e., also $n$ and 1 are considered consecutive. We define
$U(n,k)$ as the set of all partitions into $k$ subsets having no adjacencies. For instance, $U(5,3)$
consists of the partitions

{{1,4},{2,5},{3}}, {{1,4},{2},{3,5}}, {{1},{2,4},{3,5}},

{{1,3},{2,5},{4}} and {{1,3},{2,4},{5}}. (2.4)

Clearly, $U(n,1)$ is empty. We remark that it was only very recently that D. Knuth [?] raised
the problem of determining the number of partitions in $U(n,k)$.
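
The definition of $U(n,k)$ can be checked by exhaustive enumeration for small $n$. The sketch below uses hypothetical helper names and is only practical for small $n$, since it runs through all set partitions.

```python
def set_partitions(elements):
    """Yield every partition of the list `elements` as a list of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        yield [[first]] + partition


def has_circular_adjacency(block, n):
    """True if the block contains j and its circular successor (n is followed by 1)."""
    members = set(block)
    return any((j % n) + 1 in members for j in members)


def U(n, k):
    """Partitions of [n] into k blocks without any circular adjacency."""
    return [p for p in set_partitions(list(range(1, n + 1)))
            if len(p) == k and not any(has_circular_adjacency(b, n) for b in p)]
```

Calling `U(5, 3)` yields exactly the five partitions listed in (2.4), up to the order of blocks and elements.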

We will also need a slight variation of this type of partitions. Let $[K] \times [m] = \{1,\dots,K\} \times
\{1,\dots,m\}$ for some numbers $K, m \in \mathbb{N}$. We denote by $U^*(K,m,s)$ the set of all partitions of
$[K] \times [m]$ into $s$ blocks such that $(p,u)$ and $(p,u+1)$ are not contained in the same block for all
$p = 1,\dots,K$ and $u = 1,\dots,m-1$. (So this sort of consecutiveness is not understood in the circular sense.)
We remark that $U^*(K,1,k)$ is the set of all partitions of a $K$-element set into $k$ subsets (without
any restriction on the type of partition). In particular, the numbers $|U^*(K,1,k)|$ equal the
(ordinary) Stirling numbers $S(K,k)$ of the second kind. The numbers $b_n := \sum_{k=1}^{n} S(n,k)$ are
called Bell numbers [?].

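Now let $\mathcal{A} = \{A_1,\dots,A_t\}$ be a partition in $P(n,t)$ and $\mathcal{B} = \{B_1,\dots,B_s\} \in U(n,s)$. By
$A_i + 1$ we understand the set whose elements are the ones of $A_i$ incremented by 1 in the circular
sense, i.e., $n + 1 \equiv 1$. We associate a $t \times s$ matrix $M = M(\mathcal{A},\mathcal{B})$ to the pair $\mathcal{A}, \mathcal{B}$ by setting

$$M_{i,j} := |A_i \cap B_j| - |(A_i + 1) \cap B_j|, \qquad 1 \le i \le t,\ 1 \le j \le s. \tag{2.5}$$

Then we define $Q(n,t,s,R)$ to be the number of pairs of partitions $(\mathcal{A},\mathcal{B})$ with $\mathcal{A} \in P(n,t)$
and $\mathcal{B} \in U(n,s)$ such that the rank of $M(\mathcal{A},\mathcal{B})$ equals $R$, i.e.,

$$Q(n,t,s,R) := \#\{(\mathcal{A},\mathcal{B}) : \mathcal{A} \in P(n,t),\ \mathcal{B} \in U(n,s),\ \operatorname{rank} M(\mathcal{A},\mathcal{B}) = R\}. \tag{2.6}$$

Observe that

$$\sum_{i=1}^{t} M_{i,j} = \sum_{i=1}^{t} \big(|A_i \cap B_j| - |(A_i + 1) \cap B_j|\big) = |\{1,\dots,n\} \cap B_j| - |\{1,\dots,n\} \cap B_j| = 0$$

(since the $A_i$'s are disjoint) and similarly $\sum_{j=1}^{s} M_{i,j} = 0$. Thus, the rank of $M(\mathcal{A},\mathcal{B})$ is less than or
equal to $\min\{s,t\} - 1$. In other words $Q(n,t,s,R) = 0$ if $R \ge \min\{s,t\}$.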
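
The numbers $Q(n,t,s,R)$ can be tabulated for very small $n$ by checking the rank of $M(\mathcal{A},\mathcal{B})$ for all pairs of partitions. The following is a brute-force sketch with hypothetical function names; the rank is computed over the rationals with exact fractions, which is an implementation choice made here.

```python
from collections import Counter
from fractions import Fraction


def set_partitions(elements):
    """Yield every partition of the list `elements` as a list of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        yield [[first]] + partition


def shift(block, n):
    """A + 1 in the circular sense: n + 1 is identified with 1."""
    return {(j % n) + 1 for j in block}


def matrix_M(A, B, n):
    """M_{i,j} = |A_i ∩ B_j| - |(A_i + 1) ∩ B_j| as in (2.5)."""
    return [[len(set(a) & set(b)) - len(shift(a, n) & set(b)) for b in B]
            for a in A]


def rank(matrix):
    """Rank by Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    rk = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk


def Q_table(n):
    """Counter mapping (t, s, R) to Q(n, t, s, R), by exhaustive search."""
    parts = list(set_partitions(list(range(1, n + 1))))
    P_all = [p for p in parts if all(len(b) >= 2 for b in p)]
    U_all = [p for p in parts if not any(shift(b, n) & set(b) for b in p)]
    table = Counter()
    for A in P_all:
        for B in U_all:
            table[(len(A), len(B), rank(matrix_M(A, B, n)))] += 1
    return table
```

For $n = 4$ this gives, for example, $Q(4,2,2,0) = 2$ and $Q(4,2,2,1) = 1$. The number of pairs grows very quickly with $n$, so this approach is feasible only for small cases.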

Similarly, let $(\mathcal{A},\mathcal{B})$ be a pair of partitions of $[K] \times [m]$ where $\mathcal{A} = \{A_1,\dots,A_t\} \in P(Km,t)$
(identifying $[Km]$ with $[K] \times [m]$) and $\mathcal{B} = \{B_1,\dots,B_s\} \in U^*(K,m,s)$. Let $A_i - 1$ denote the
set whose elements are $\{(p,u-1) : (p,u) \in A_i\}$. In contrast to above we do not calculate in
the circular sense this time. So elements of the form $(p,0)$ may appear in $A_i - 1$. Then to such
a pair $(\mathcal{A},\mathcal{B})$ we associate a matrix $L = L(\mathcal{A},\mathcal{B})$ with entries

$$L_{i,j} = \sum_{(p,u) \in A_i \cap B_j} (-1)^p \;-\; \sum_{(p,u) \in (A_i - 1) \cap B_j} (-1)^p. \tag{2.7}$$

Finally, we define

$$Q^*(K,m,t,s,R) := \#\{(\mathcal{A},\mathcal{B}) : \mathcal{A} \in P(Km,t),\ \mathcal{B} \in U^*(K,m,s),\ \operatorname{rank} L(\mathcal{A},\mathcal{B}) = R\}. \tag{2.8}$$

Later in Section 4 we will provide some more information on these combinatorial quantities.

2.3 The Main Theorems


In order to formulate our first theorem let $F_n(\theta)$, $n \in \mathbb{N}$, denote the functions defined in terms
of a generating function by

$$\sum_{n=1}^{\infty} F_n(\theta) \frac{x^n}{n!} = \exp\big(\theta(e^x - x - 1)\big). \tag{2.9}$$

Clearly, $F_n$ is connected to the associated Stirling numbers of the second kind $S_2(n,k)$ by (2.2),
namely $F_n(\theta) = \sum_{k=1}^{\lfloor n/2 \rfloor} S_2(n,k)\,\theta^k$.
We refer to Section [?] for a list of $F_{2n}$ for $n = 1,\dots,6$. Further, we define

$$G_n(\theta) := \theta^{-n} F_n(\theta).$$
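
Since comparing (2.9) with (2.2) gives $F_n(\theta) = \sum_k S_2(n,k)\theta^k$, the functions $F_n$ and $G_n$ can be evaluated directly from the recursion (2.3). The following is a minimal sketch; the function names and base cases are chosen here for illustration.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def S2(n, k):
    """Associated Stirling numbers of the second kind via recursion (2.3)."""
    if n == 0 and k == 0:
        return 1
    if n <= 1 or k <= 0:
        return 0
    return k * S2(n - 1, k) + (n - 1) * S2(n - 2, k - 1)


def F_coefficients(n):
    """Coefficients of theta^1, ..., theta^(n // 2) in F_n(theta)."""
    return [S2(n, k) for k in range(1, n // 2 + 1)]


def F(n, theta):
    """F_n(theta) = sum_k S2(n, k) * theta^k."""
    return sum(S2(n, k) * theta ** k for k in range(1, n // 2 + 1))


def G(n, theta):
    """G_n(theta) = theta^(-n) * F_n(theta)."""
    return F(n, theta) / theta ** n
```

For example, `F_coefficients(4)` is `[1, 3]`, i.e., $F_4(\theta) = \theta + 3\theta^2$. Passing a `Fraction` for `theta` keeps the evaluation exact.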

Also recall that $D = (2q+1)^d$. Then our first theorem about exact reconstruction of sparse
trigonometric polynomials reads as follows.

Theorem 2.1. Assume $f \in \Pi_q^d(M)$ with some sparsity $M \in \mathbb{N}$. Let $x_1,\dots,x_N \in [0,2\pi]^d$
be independent random variables having the uniform distribution on $[0,2\pi]^d$. Choose $n \in \mathbb{N}$,
$\beta > 0$, $\kappa > 0$ and $K_1,\dots,K_n \in \mathbb{N}$ such that

$$a := \sum_{m=1}^{n} \beta^{n/K_m} < 1 \quad\text{and}\quad \frac{\kappa}{1-\kappa} \le \frac{1-a}{1+a}\, M^{-3/2}. \tag{2.10}$$

Set $\theta := N/M$. Then with probability at least

$$1 - \Big( D\beta^{-2n} \sum_{m=1}^{n} G_{2mK_m}(\theta) + M\kappa^{-2} G_{2n}(\theta) \Big) \tag{2.11}$$

$f$ can be reconstructed exactly from its sample values $f(x_1),\dots,f(x_N)$ by solving the
minimization problem (2.1).

We will illustrate the probability bound (2.11) later in Section 5 with some plots. In
particular, the probability of exact reconstruction is high if the “non-linear oversampling factor”
$\theta = N/M$ is large enough. Of course, in order to obtain useful results one has to optimize with
respect to the parameters occurring in (2.11). In particular, the choice of $n$ is crucial. It must
be chosen neither too small nor too large, depending on $\theta$. Indeed, pursuing this strategy
leads to the following qualitative result.

Corollary 2.2. There exists an absolute constant $C > 0$ such that the following is true.
Assume $f \in \Pi_q^d(M)$ for some sparsity $M \in \mathbb{N}$. Let $x_1,\dots,x_N \in [0,2\pi]^d$ be independent
random variables having the uniform distribution on $[0,2\pi]^d$. If for some $\varepsilon > 0$ it holds

$$N \ge CM(\log D + \log(\varepsilon^{-1}))$$

then with probability at least $1 - \varepsilon$ the trigonometric polynomial $f$ can be recovered from its
sample values $f(x_j)$, $j = 1,\dots,N$, by solving the $\ell^1$-minimization problem (2.1).

This formulation is similar to the main theorem in [?] concerned with exact reconstruction
in the context of the discrete Fourier transform. Indeed, setting $\varepsilon = D^{-\alpha}$ yields a probability
of exact reconstruction of at least $1 - D^{-\alpha}$ provided $N \ge CM(\alpha + 1)\log D$.

We remark that (2.11) of Theorem 2.1 allows one to actually compute precise bounds on the
probability of exact reconstruction when the parameters $M$, $N$, $D$ are given explicitly. But
clearly, the previous corollary is easier to interpret. This is the reason why we have given both
results.

For our next theorem we also model the set $T \subset [-q,q]^d$ of non-vanishing Fourier
coefficients as random. So we will not treat arbitrary sparse polynomials, but only “generic” ones.
The hope is, of course, that this provides even better estimates for the probability of exact
reconstruction.

Let $0 < \tau < 1$. The probability that an index $k \in [-q,q]^d$ belongs to $T$ is modelled as

$$\mathbb{P}(k \in T) = \tau \tag{2.12}$$

independently for each $k$. We also assume that the choice of $T$ and the choice of the sampling
set $X$ are stochastically independent. Clearly, the expected size of $T$ is $\mathbb{E}|T| = \tau D = \tau(2q+1)^d$
and $|T|$ follows the binomial distribution. For convenience we also introduce $\Pi_T$ as the set of
all trigonometric polynomials whose coefficients are supported on $T$.
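
The random model (2.12) for the support can be simulated directly. The sketch below uses a hypothetical function name and simply includes each index of the grid independently with probability $\tau$.

```python
import itertools
import random


def random_support(q, d, tau, rng):
    """Return a random subset T of [-q, q]^d; each index is kept with probability tau."""
    grid = itertools.product(range(-q, q + 1), repeat=d)
    return [k for k in grid if rng.random() < tau]


rng = random.Random(1)
T = random_support(q=3, d=2, tau=0.2, rng=rng)
```

Here $D = 7^2 = 49$ and the expected size of `T` is $\tau D = 9.8$, while the actual size varies from draw to draw according to the binomial distribution.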

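We also need some auxiliary notation. For $n \in \mathbb{N}$ we define

$$W(n,N,\mathbb{E}|T|,D) := N^{-2n} \sum_{t=1}^{\min\{n,N\}} \frac{N!}{(N-t)!} \sum_{s=2}^{2n} (\mathbb{E}|T|)^s \sum_{R=0}^{\min\{t,s\}-1} Q(2n,t,s,R)\, D^{-R} \tag{2.13}$$

and for $K, m \in \mathbb{N}$

$$Z(K,m,N,\mathbb{E}|T|,D) := N^{-2Km} \sum_{t=1}^{\min\{Km,N\}} \frac{N!}{(N-t)!} \sum_{s=1}^{2Km} (\mathbb{E}|T|)^s \sum_{R=0}^{\min\{t,s\}} Q^*(2K,m,t,s,R)\, D^{-R}.$$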

Our second theorem about reconstructing a sparse trigonometric polynomial from random
samples by Basis Pursuit is given as follows.

Theorem 2.3. Let $x_1,\dots,x_N \in [0,2\pi]^d$ be independent random variables having the uniform
distribution on $[0,2\pi]^d$. Further assume that $T$ is a random subset of $[-q,q]^d$ modelled by
(2.12) (with $T$ being independent of $x_1,\dots,x_N$) such that $\mathbb{E}|T| = \tau D \ge 1$. Choose $n \in \mathbb{N}$,
$\alpha > 0$, $\beta > 0$, $\kappa > 0$ and $K_1,\dots,K_n \in \mathbb{N}$ such that

$$a := \sum_{m=1}^{n} \beta^{n/K_m} < 1 \quad\text{and}\quad \frac{\kappa}{1-\kappa} \le \frac{1-a}{1+a} \big((\alpha + 1)\mathbb{E}|T|\big)^{-3/2}. \tag{2.14}$$

Then with probability at least

$$1 - \Big( \kappa^{-2}\, W(n,N,\mathbb{E}|T|,D) + \beta^{-2n} D \sum_{m=1}^{n} Z(K_m,m,N,\mathbb{E}|T|,D) + \exp\Big(-\frac{3\alpha^2}{6 + 2\alpha}\,\mathbb{E}|T|\Big) \Big) \tag{2.15}$$

any $f \in \Pi_T \subset \Pi_q^d(|T|)$ can be reconstructed exactly from its sample values $f(x_1),\dots,f(x_N)$ by
solving the minimization problem (2.1).

Of course, the theorem has to be understood in the sense that the set $T$ is not known
a priori because with the knowledge of $T$ it would be in fact much easier to reconstruct $f$.
(Although it seems that in higher dimensions $d \ge 2$ not many theoretical results are available,
see e.g. [1].)

Like the previous theorem this result shows that the probability for reconstructing the
original sparse polynomial is indeed high for appropriate choices of the number of sampling
points $N$ and the expected sparsity $\mathbb{E}|T|$. We will illustrate this later in Section 5 by computing
numerical plots for the bound in (2.15). Since the theorem does not treat arbitrary $f$'s but
only “generic” ones in the sense that the set $T$ is random one may expect that the bound for
the probability of exact reconstruction is better than the one in Theorem 2.1. As we will see
later, this is indeed the case if one takes the same $n$, see also Section 5. Unfortunately, we
were not able to compute the bound explicitly for $n \ge 5$ so that practically up to now Theorem
2.1 gives the better bounds since there we are able to evaluate the bound (2.11) for any $n$.

The reason for not being able to compute (2.15) for arbitrary $n$ is that we do
not have an explicit expression (or a recursion formula, or a good estimate) for the numbers
$Q(2n,t,s,R)$ and $Q^*(2K,m,t,s,R)$. We were only able to compute them on a computer up to
$n = 4$ by checking the rank of $M(\mathcal{A},\mathcal{B})$ and $L(\mathcal{A},\mathcal{B})$ for all possible pairs $(\mathcal{A},\mathcal{B})$ of partitions.
Already for $n = 5$ the computing times would exceed several days and with $n = 7$ at the
latest the task becomes nearly impossible since the rank of $576535660478649 \approx 5.7 \times 10^{14}$
matrices would have to be checked for computing the numbers $Q(14,t,s,R)$. So we have to leave
it as an interesting open problem to provide more information on the numbers $Q(2n,t,s,R)$
and $Q^*(2K,m,t,s,R)$, see also Section 4. We hope that progress on this combinatorial
problem will allow us to improve our probability bounds significantly.


Remark 2.4. (a) In both theorems it is reasonable to choose $K_m \approx n/m$, for instance rounding
$n/m$ to the nearest integer. In this way $mK_m \approx n$ for all $m$ and further

$$\sum_{m=1}^{n} \beta^{n/K_m} \approx \sum_{m=1}^{n} \beta^{m} \approx \frac{\beta}{1-\beta}.$$

Indeed, in the limit $n \to \infty$ all the above expressions become equal. As we require the left
hand side to be less than 1, we should choose $\beta$ approximately less than 1/2. Actually, a
choice near 1/2 turned out to be good.

(b) There is nothing special about the underlying set $[-q,q]^d \cap \mathbb{Z}^d$. Indeed, both theorems still
hold when taking any other finite subset of $\mathbb{Z}^d$ of size $D$ instead.

(c) If one is interested in choosing the dimension $D = (2q+1)^d$ of the problem very large
then one may observe that

$$\lim_{D \to \infty} W(n,N,\mathbb{E}|T|,D) = N^{-2n} \sum_{t=1}^{\min\{n,N\}} \frac{N!}{(N-t)!} \sum_{s=2}^{2n} Q(2n,t,s,0)\, (\mathbb{E}|T|)^s$$

and

$$\lim_{D \to \infty} Z(K,m,N,\mathbb{E}|T|,D) = N^{-2Km} \sum_{t=1}^{\min\{Km,N\}} \frac{N!}{(N-t)!} \sum_{s=1}^{2Km} Q^*(2K,m,t,s,0)\, (\mathbb{E}|T|)^s.$$

(Of course, we keep $\mathbb{E}|T|$ fixed in this limit so that $\tau = \mathbb{E}|T|/D$, see (2.12), has to
be adjusted in the process of passing with $D$ to infinity.) This shows that the numbers
$Q(2n,t,s,0)$ and $Q^*(2K,m,t,s,0)$ play the most important role in the probability bound
(2.15) of Theorem 2.3. In fact, the tables in the Appendix and Lemma [?] indicate that
these numbers are quite small for $R = 0$ compared to other values of $R$.

(d) In practice, we usually do not have precisely sparse signals. However, signals that can
be approximated by sparse ones may appear quite frequently (e.g. in the context of best
$n$-term approximation). We leave the investigation of related questions to future
contributions, see also [?] for the setting of the discrete Fourier transform.
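
The choice of the parameters $K_m$ discussed in Remark 2.4(a) can be explored numerically. The following is a minimal sketch with hypothetical function names; it rounds $n/m$ to the nearest integer (at least 1) and evaluates the sum $\sum_{m=1}^{n} \beta^{n/K_m}$, which both theorems require to be smaller than 1.

```python
def choose_K(n):
    """K_m = n/m rounded to the nearest integer (but at least 1), for m = 1, ..., n."""
    return [max(1, int(n / m + 0.5)) for m in range(1, n + 1)]


def a_value(n, beta):
    """a = sum_{m=1}^n beta^(n / K_m) for the choice of K_m above."""
    return sum(beta ** (n / K) for K in choose_K(n))


# Compare with the limiting value beta / (1 - beta) for a moderately large n.
n, beta = 50, 0.4
a = a_value(n, beta)
limit = beta / (1.0 - beta)
```

In this example `a` stays close to `limit` and below 1, in line with the recommendation to take $\beta$ somewhat below 1/2.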


3 Proof of the Main Results 


We will develop the proofs of both theorems in parallel. The basic idea is similar to that of the
paper [?] by Candes, Romberg and Tao. However, there are also significant differences and, in
particular, it turns out that our approach leads to a simpler and slightly less technical proof
(although still considerably elaborate). Also the idea of modelling the “sparsity set” as random
is new and requires special treatment.

Let us first introduce some auxiliary notation. By $\ell^2([-q,q]^d)$, $\ell^2(T)$, $\ell^2(X)$ we denote the
$\ell^2$ space of sequences indexed by $[-q,q]^d$, $T \subset [-q,q]^d$ and $X$, respectively, endowed with the
usual Euclidean norm. Moreover, we introduce the operator

$$\mathcal{F}_X : \ell^2([-q,q]^d) \to \ell^2(X), \qquad \mathcal{F}_X c(x_j) := \sum_{k \in [-q,q]^d} c_k e^{ik\cdot x_j}, \quad j = 1,\dots,N.$$

By $\mathcal{F}_{TX} : \ell^2(T) \to \ell^2(X)$ we denote the restriction of $\mathcal{F}_X$ to sequences supported only on $T$.
The adjoint operators are denoted by $\mathcal{F}_X^* : \ell^2(X) \to \ell^2([-q,q]^d)$ and $\mathcal{F}_{TX}^* : \ell^2(X) \to \ell^2(T)$.

Clearly, our problem is equivalent to reconstructing a sequence $c$ from $\mathcal{F}_X c$ by solving the
problem

$$\min \|c'\|_1 \quad\text{subject to}\quad \mathcal{F}_X c' = \mathcal{F}_X c. \tag{3.1}$$

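For $c \in \ell^2([-q,q]^d)$ we introduce its sign by

$$\operatorname{sgn}(c)_k = \frac{c_k}{|c_k|}, \quad k \in \operatorname{supp} c, \qquad\text{and}\qquad \operatorname{sgn}(c)_k = 0, \quad k \notin \operatorname{supp} c.$$

Here, $\operatorname{supp} c$ denotes the support of $c$.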

The key lemma for our proofs is the following duality principle. 

Lemma 3.1. Let $c \in \ell^2([-q,q]^d)$ and $T := \operatorname{supp} c$. Assume $\mathcal{F}_{TX} : \ell^2(T) \to \ell^2(X)$ to be
injective. Suppose that there exists a vector $P \in \ell^2([-q,q]^d)$ with the following properties:

(i) $P_k = \operatorname{sgn}(c)_k$ for all $k \in T$,

(ii) $|P_k| < 1$ for all $k \notin T$,

(iii) there exists a vector $\lambda \in \ell^2(X)$ such that $P = \mathcal{F}_X^* \lambda$.

Then $c$ is the unique minimizer to the problem (3.1).

Proof: The proof mimics the one by Candes, Romberg and Tao [?, Lemma 2.1]. For the
sake of completeness we repeat the argument.

Let $b$ be a vector with $\mathcal{F}_X b = \mathcal{F}_X c$ and set $h := b - c$. Clearly, $\mathcal{F}_X h$ vanishes. For any
$k \in T$ we have

$$|b_k| = |c_k + h_k| = \big|\, |c_k| + h_k \overline{\operatorname{sgn}(c)_k}\, \big| \ge |c_k| + \operatorname{Re}\big(h_k \overline{\operatorname{sgn}(c)_k}\big) = |c_k| + \operatorname{Re}\big(h_k \overline{P_k}\big).$$

If $k \notin T$ then $|b_k| = |h_k| \ge \operatorname{Re}(h_k \overline{P_k})$ since $|P_k| < 1$. This gives

$$\|b\|_1 \ge \|c\|_1 + \sum_{k \in [-q,q]^d} \operatorname{Re}\big(h_k \overline{P_k}\big).$$

Further, observe that

$$\sum_{k \in [-q,q]^d} \operatorname{Re}\big(h_k \overline{P_k}\big) = \operatorname{Re}\Big( \sum_{k \in [-q,q]^d} h_k \overline{(\mathcal{F}_X^* \lambda)_k} \Big) = \operatorname{Re}\Big( \sum_{j=1}^{N} (\mathcal{F}_X h)(x_j)\, \overline{\lambda(x_j)} \Big) = 0$$

since $\mathcal{F}_X h$ vanishes. Altogether, we proved $\|b\|_1 \ge \|c\|_1$, and thus $c$ is a minimizer of (3.1).

It remains to prove the uniqueness. The above argument shows that having the equality
$\|b\|_1 = \|c\|_1$ forces $|h_k| = \operatorname{Re}(h_k \overline{P_k})$ for all $k \notin T$. Since $|P_k| < 1$ this means that $h$ vanishes
outside $T$. Since also $\mathcal{F}_X h$ vanishes, it follows from the injectivity of $\mathcal{F}_{TX}$ that $h$ vanishes
identically and hence, $b = c$. This shows that $c$ is the unique minimizer of (3.1). ■


Concerning the assumption on the injectivity of $\mathcal{F}_{TX}$ we have the following simple result.

Lemma 3.2. If $N \ge |T|$ then $\mathcal{F}_{TX}$ is injective almost surely.

Proof: The proof is essentially contained in [1, Theorem 3.2]. There it is proved that any
$|T| \times |T|$ submatrix of $\mathcal{F}_{TX}$ has non-vanishing determinant almost surely (even under slightly
more general assumptions on the distribution of the random variables $x_1,\dots,x_N$). This implies
the result. ■


Now our strategy for proving Theorem 2.3 is obvious. We need to show that with high
probability there exists a vector $P$ with the properties assumed in Lemma 3.1. To this end
we proceed similarly as in [5]. (Actually, the injectivity of $\mathcal{F}_{TX}$ will also follow from this finer
analysis so that Lemma 3.2 will not be needed in the end.)

We introduce the restriction operator $R_T : \ell^2([-q,q]^d) \to \ell^2(T)$, $(R_T c)_k = c_k$ for $k \in T$. Its
adjoint $R_T^* = E_T : \ell^2(T) \to \ell^2([-q,q]^d)$ is the operator that extends a vector outside $T$ by
zero, i.e., $(E_T d)_k = d_k$ for $k \in T$ and $(E_T d)_k = 0$ otherwise.

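Now assume for the moment that $\mathcal{F}_{TX}^* \mathcal{F}_{TX} : \ell^2(T) \to \ell^2(T)$ is invertible. (By Lemma 3.2
this is true almost surely if $N \ge |T|$ since $\mathcal{F}_{TX}$ is then injective.) In this case we define $P$
explicitly by

$$P := \mathcal{F}_X^* \mathcal{F}_{TX} \big(\mathcal{F}_{TX}^* \mathcal{F}_{TX}\big)^{-1} R_T \operatorname{sgn}(c),$$

where as before $T = \operatorname{supp} c$. Then clearly $P$ has property (i) and property (iii) in Lemma 3.1
with

$$\lambda := \mathcal{F}_{TX} \big(\mathcal{F}_{TX}^* \mathcal{F}_{TX}\big)^{-1} R_T \operatorname{sgn}(c) \in \ell^2(X).$$

We are left with proving that $P$ has property (ii) of Lemma 3.1 with high probability.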

To this end we introduce the auxiliary operators

$$H : \ell^2(T) \to \ell^2([-q,q]^d), \qquad H := N E_T - \mathcal{F}_X^* \mathcal{F}_{TX}$$

and

$$H_0 : \ell^2(T) \to \ell^2(T), \qquad H_0 := R_T H = N I_T - \mathcal{F}_{TX}^* \mathcal{F}_{TX},$$

where $I_T$ denotes the identity on $\ell^2(T)$. Obviously, $H_0$ is self-adjoint, and $H$ acts on a vector
as

$$(Hc)_\ell = -\sum_{j=1}^{N} \sum_{k \in T,\ k \ne \ell} c_k e^{i(k-\ell)\cdot x_j}.$$
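
The structure of $H_0$ can be checked numerically in dimension $d = 1$. The sketch below uses hypothetical function names and plain Python lists; it forms $H_0 = N I_T - \mathcal{F}_{TX}^* \mathcal{F}_{TX}$ entry by entry from the sampling matrix.

```python
import cmath
import math
import random


def sampling_matrix(T, points):
    """Matrix of F_TX: rows indexed by sampling points, columns by k in T."""
    return [[cmath.exp(1j * k * x) for k in T] for x in points]


def H0_matrix(T, points):
    """H_0 = N * I - F_TX^* F_TX, computed entry by entry."""
    N = len(points)
    F = sampling_matrix(T, points)
    H0 = []
    for a in range(len(T)):
        row = []
        for b in range(len(T)):
            # (F_TX^* F_TX)(a, b) = sum_j conj(F[j][a]) * F[j][b]
            gram = sum(F[j][a].conjugate() * F[j][b] for j in range(N))
            row.append((N if a == b else 0) - gram)
        H0.append(row)
    return H0


rng = random.Random(0)
T = [-3, 0, 2, 5]
points = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(7)]
H0 = H0_matrix(T, points)
```

Up to rounding errors, the diagonal of `H0` vanishes, the matrix is Hermitian, and the off-diagonal entry in row $k$ and column $k'$ equals $-\sum_j e^{i(k'-k) x_j}$.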

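Now we can write

$$P = (N E_T - H)(N I_T - H_0)^{-1} R_T \operatorname{sgn}(c).$$

As we are interested in property (ii) in Lemma 3.1 we consider only values of $P$ on $T^c =
[-q,q]^d \setminus T$. Since $R_{T^c} E_T = 0$ we have

$$P_k = -\frac{1}{N} \Big( R_{T^c} H \big(I_T - \tfrac{1}{N} H_0\big)^{-1} R_T \operatorname{sgn}(c) \Big)_k \quad\text{for all } k \in T^c.$$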

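Let us look closer at the term $(I_T - \frac{1}{N} H_0)^{-1}$. To this end let $n \in \mathbb{N}$ be some arbitrary number.
By the von Neumann series we can write

$$\Big( I_T - \big(\tfrac{1}{N} H_0\big)^n \Big)^{-1} = I_T + A_n \quad\text{with}\quad A_n := \sum_{r=1}^{\infty} \big(\tfrac{1}{N} H_0\big)^{rn}. \tag{3.2}$$

Using the identity

$$(1 - M)^{-1} = (1 - M^n)^{-1} (1 + M + \cdots + M^{n-1}) \tag{3.3}$$

we obtain

$$\big( I_T - \tfrac{1}{N} H_0 \big)^{-1} = (I_T + A_n) \sum_{m=0}^{n-1} \big(\tfrac{1}{N} H_0\big)^m.$$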


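Thus, on the complement of $T$, we may write

$$
\begin{aligned}
P &= -\frac{1}{N} H (I_T + A_n) \Big( \sum_{m=0}^{n-1} (N^{-1} H_0)^m \Big) R_T \operatorname{sgn}(c) \\
  &= -\sum_{m=1}^{n} (N^{-1} H R_T)^m \operatorname{sgn}(c) - \frac{1}{N} H A_n R_T \sum_{m=0}^{n-1} (N^{-1} H R_T)^m \operatorname{sgn}(c) \\
  &= -\big(P^{(1)} + P^{(2)}\big),
\end{aligned}
$$

where

$$P^{(1)} = S_n \operatorname{sgn}(c) \quad\text{and}\quad P^{(2)} = \frac{1}{N} H A_n R_T (I + S_{n-1}) \operatorname{sgn}(c),$$

with

$$S_n := \sum_{m=1}^{n} (N^{-1} H R_T)^m.$$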


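Our aim is to estimate $\mathbb{P}(\sup_{k \in T^c} |P_k| \ge 1)$. To this end let $a_1, a_2 > 0$ be numbers satisfying
$a_1 + a_2 = 1$. Then

$$\mathbb{P}\Big(\sup_{k \in T^c} |P_k| \ge 1\Big) \le \mathbb{P}\Big( \Big\{\sup_{k \in T^c} |P_k^{(1)}| \ge a_1\Big\} \cup \Big\{\sup_{k \in T^c} |P_k^{(2)}| \ge a_2\Big\} \Big). \tag{3.4}$$

Clearly,

$$\mathbb{P}\big(|P_k^{(1)}| \ge a_1\big) = \mathbb{P}\Big( \Big| \sum_{m=1}^{n} \big((N^{-1} H R_T)^m \operatorname{sgn}(c)\big)_k \Big| \ge a_1 \Big) \le \mathbb{P}\Big( \sum_{m=1}^{n} \big| \big((N^{-1} H R_T)^m \operatorname{sgn}(c)\big)_k \big| \ge a_1 \Big) =: \mathbb{P}(E_k). \tag{3.5}$$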


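Consider $P^{(2)}$. Denoting by $\ell^\infty = \ell^\infty([-q,q]^d)$ the space of sequences indexed by $[-q,q]^d$ with
the supremum norm (and similarly defining $\ell^\infty(T)$) we have

$$\sup_{k \in T^c} |P_k^{(2)}| \le \|P^{(2)}\|_\infty \le \big\| \tfrac{1}{N} H A_n \big\|_{\ell^\infty(T) \to \ell^\infty} \big( 1 + \|R_T S_{n-1} \operatorname{sgn}(c)\|_{\ell^\infty(T)} \big). \tag{3.6}$$

In order to analyze the term $\|R_T S_{n-1} \operatorname{sgn}(c)\|_{\ell^\infty(T)}$ we observe that similarly as in (3.5)

$$\mathbb{P}\big( |(S_{n-1} \operatorname{sgn}(c))_k| \ge a_1 \big) \le \mathbb{P}\Big( \sum_{m=1}^{n} \big| \big((N^{-1} H R_T)^m \operatorname{sgn}(c)\big)_k \big| \ge a_1 \Big) = \mathbb{P}(E_k).$$

Let us now treat the operator norm appearing in (3.6). For simplicity we write $\|\cdot\|_\infty$ instead
of $\|\cdot\|_{\ell^\infty \to \ell^\infty}$. It holds $\|A\|_\infty = \sup_r \sum_s |A_{rs}|$. Clearly,

$$\big\| \tfrac{1}{N} H A_n \big\|_{\ell^\infty(T) \to \ell^\infty} \le \big\| \tfrac{1}{N} H \big\|_\infty \, \|A_n\|_{\ell^\infty(T)}.$$

Moreover, $\|\tfrac{1}{N} H\|_\infty \le |T|$ as $H$ has $|T|$ columns and each entry is bounded by $N$ in absolute
value.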


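In order to analyze $A_n$ we will work with the Frobenius norm. For a matrix $A$ it is defined
as

$$\|A\|_F^2 := \operatorname{Tr}(AA^*) = \sum_{r,s} |A_{rs}|^2,$$

where $\operatorname{Tr}(AA^*)$ denotes the trace of $AA^*$. Assume for the moment that

$$\big\| \big(\tfrac{1}{N} H_0\big)^n \big\|_F \le \kappa < 1. \tag{3.7}$$

Then it follows directly from the definition (3.2) of $A_n$ that

$$\|A_n\|_F = \Big\| \sum_{r=1}^{\infty} \big(\tfrac{1}{N} H_0\big)^{rn} \Big\|_F \le \sum_{r=1}^{\infty} \big\| \big(\tfrac{1}{N} H_0\big)^n \big\|_F^r \le \sum_{r=1}^{\infty} \kappa^r = \frac{\kappa}{1-\kappa}.$$

Moreover, since $A_n$ has $|T|$ columns it follows from the Cauchy-Schwarz inequality that

$$\|A_n\|_\infty^2 \le \sup_r |T| \sum_s |A_n(r,s)|^2 \le |T| \, \|A_n\|_F^2.$$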
So assuming (3.7) and $\|S_{n-1} \operatorname{sgn}(c)\|_\infty < a_1$ we have

$$\sup_{k \in T^c} |P_k^{(2)}| < (1 + a_1)\, |T|^{3/2}\, \frac{\kappa}{1-\kappa}.$$

In particular, if

$$\frac{\kappa}{1-\kappa} \le \frac{a_2}{1 + a_1}\, |T|^{-3/2} \tag{3.8}$$

then $\sup_{k \in T^c} |P_k^{(2)}| < a_2$ as desired. Also it follows from (3.8) that $\kappa < 1$ as $|T| \ge 1$ without
loss of generality (if $T = \emptyset$ then $f = 0$ and $\ell^1$-minimization will clearly recover $f$).

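Now we have to distinguish between the situation in Theorem 2.1 and the one in Theorem
2.3 since in the latter $|T|$ is a random variable while in the former it is deterministic.

1. Let us first treat the case of Theorem 2.3 where $|T|$ is random. If

$$|T| \le (\alpha + 1)\mathbb{E}|T|$$

with $\alpha > 0$ and

$$\frac{\kappa}{1-\kappa} \le \frac{a_2}{1 + a_1} \big((\alpha + 1)\mathbb{E}|T|\big)^{-3/2} \tag{3.9}$$

then clearly (3.8) is satisfied and consequently

$$\sup_{k \in T^c} |P_k^{(2)}| < a_2.$$

Using the union bound we altogether obtain from (3.4)

$$
\begin{aligned}
\mathbb{P}\Big(\sup_{k \in T^c} |P_k| \ge 1\Big)
&\le \mathbb{P}\Big( \Big\{\sup_{k \in T^c} |P_k^{(1)}| \ge a_1\Big\} \cup \big\{\|R_T S_{n-1} \operatorname{sgn}(c)\|_{\ell^\infty(T)} \ge a_1\big\} \cup \big\{\|(N^{-1} H_0)^n\|_F > \kappa\big\} \cup \big\{|T| > (\alpha + 1)\mathbb{E}|T|\big\} \Big) \\
&\le \mathbb{P}\Big( \bigcup_{k \in [-q,q]^d} E_k \cup \big\{\|(N^{-1} H_0)^n\|_F > \kappa\big\} \cup \big\{|T| > (\alpha + 1)\mathbb{E}|T|\big\} \Big) \\
&\le \sum_{k \in [-q,q]^d} \mathbb{P}(E_k) + \mathbb{P}\big(\|(N^{-1} H_0)^n\|_F > \kappa\big) + \mathbb{P}\big(|T| > (\alpha + 1)\mathbb{E}|T|\big).
\end{aligned} \tag{3.10}
$$

As $|T|$ is the sum of independent random variables we obtain for the third term from the
large deviation theorem (see for instance equation (6) in [21] where also slightly better
estimates are available)

$$\mathbb{P}\big(|T| \ge \mathbb{E}|T| + \alpha\mathbb{E}|T|\big) \le \exp\Big( -\frac{(\alpha\mathbb{E}|T|)^2}{2\mathbb{E}|T| + 2(\alpha\mathbb{E}|T|)/3} \Big) = \exp\Big( -\frac{3\alpha^2}{6 + 2\alpha}\, \mathbb{E}|T| \Big). \tag{3.11}$$

So we are left with the two other expressions in (3.10).

2. In the situation of Theorem 2.1 we proceed in almost the same way with the only
difference that we do not need to treat $|T|$ as a random variable. Under the condition in (3.8)
this yields

$$\mathbb{P}\Big(\sup_{k \in T^c} |P_k| \ge 1\Big) \le \sum_{k \in [-q,q]^d} \mathbb{P}(E_k) + \mathbb{P}\big(\|(N^{-1} H_0)^n\|_F > \kappa\big). \tag{3.12}$$

Hence, also here we need to estimate $\mathbb{P}(E_k)$ and $\mathbb{P}(\|(N^{-1} H_0)^n\|_F > \kappa)$.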


3.1 Analysis of powers of $H_0$

In this section we treat the second term in (3.10) and (3.12), i.e., we estimate powers of the
random matrix $H_0$ in the Frobenius norm. To this end Markov’s inequality suggests estimating
the expectation of $\|H_0^n\|_F^2$. In the following lemma we only take the expectation with respect
to the random sampling set $X = \{x_1,\dots,x_N\}$. For the situation of Theorem 2.3 we postpone
the computation of the full expectation $\mathbb{E} = \mathbb{E}_T \mathbb{E}_X$ (the latter by Fubini’s theorem).

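Lemma 3.3. It holds

$$\mathbb{E}_X\big[\|H_0^n\|_F^2\big] = \sum_{t=1}^{\min\{n,N\}} \frac{N!}{(N-t)!} \sum_{\mathcal{A} \in P(2n,t)} \ \sum_{\substack{k_1,\dots,k_{2n} \in T \\ k_j \ne k_{j+1},\ j \in [2n]}} \ \prod_{A \in \mathcal{A}} \delta\Big( \sum_{r \in A} (k_{r+1} - k_r) \Big)$$

where $\delta(n)$ denotes the Kronecker $\delta_{0n}$ and $k_{2n+1} = k_1$.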

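Proof: As $H_0$ is self-adjoint we need to estimate $\|H_0^n\|_F^2 = \operatorname{Tr}(H_0^{2n})$. Observe that

$$H_0(k,k') = h(k' - k), \qquad k, k' \in T,$$

with

$$h(k) = -(1 - \delta(k)) \sum_{\ell=1}^{N} e^{ik\cdot x_\ell}.$$

Thus, $H_0^2(k,k') = \sum_{t \in T,\ t \ne k,k'} h(t - k)\, h(k' - t)$ and

$$H_0^{2n}(k_1,k_1) = \sum_{\substack{k_2,\dots,k_{2n} \in T \\ k_j \ne k_{j+1}}} h(k_2 - k_1)\, h(k_3 - k_2) \cdots h(k_1 - k_{2n})$$

where we agree on the convention that $k_{2n+1} = k_1$. This yields

$$\operatorname{Tr}(H_0^{2n}) = \sum_{\substack{k_1,\dots,k_{2n} \in T \\ k_j \ne k_{j+1}}} h(k_2 - k_1)\, h(k_3 - k_2) \cdots h(k_1 - k_{2n}).$$

Using linearity of expectation and the definition of $h$ we get

$$\mathbb{E}_X\big[\operatorname{Tr}(H_0^{2n})\big] = \sum_{\ell_1,\dots,\ell_{2n}=1}^{N} \ \sum_{\substack{k_1,\dots,k_{2n} \in T \\ k_j \ne k_{j+1}}} \mathbb{E}_X\Big[ \exp\Big( i \sum_{r=1}^{2n} (k_{r+1} - k_r) \cdot x_{\ell_r} \Big) \Big].$$

Let us consider the latter expected value. Here we have to take into account that some of the
indices $\ell_r$ might be the same. This is where set partitions enter the game.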
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We associate a partition A = (Ai,..., A t ) of {1,... , 2 n} to a certain vector (li ,..., £ 2 n) 
such that £ r = £ r > if and only if r and r' are contained in the same set A; £ A. This is allows 
us to unambiguously write £a instead of £ r if r £ A. The independence of the x^ A yields 


E a - 


2 n 


exp * X^ v+i _ kr 


■ Xt r 


r=1 


= Ea 


exp * S X^+i - At) ‘ x £a 
\ AgAtgA / 


II E ' 

AgA 


exp ( i ^2(k r+ 1 - k r ) ■ x lA 
\ rGA / 


Since X£ A has the uniform distribution on [0, 27r] d we obtain 


Ea 


exp i ) (k r+ 1 — k r ) ■ X£ a = / exp i N (fc r +i — k r ) ■ x \ dx 

\r7i /J ■W]" \ rGA y 


(3.13) 


h I ^ ( ( k r +1 Ay 

VreA 


(3.14) 


Observe that the last expression is independent of the precise values of the i r . Only the 
generated partition A plays a role. Moreover, if A £ A contains only one element then CM 
vanishes due to the condition k r+ \ A k r . Thus, we only need to consider partitions A satisfying 
|A| > 2 for all A £ A, i.e., partitions in P(2n, t). Moreover, observe that the number of vectors 
(Iaj, • • • AA t ) £ {1, • • • ,N}* with different entries is precisely N ■ ■ ■ (N — t + 1) = N\/(N — f)! 
if IV > t and 0 if N < t. Finally, we obtain 


E x[\\H ( 


n 11 2 1 

o IIfJ 


minjn, N} 

E 


E E IP E< k 


r+1 


— k r 


t= 1 AGP(2n,t) fci,...,fe 2 nST AgA 

kjf -/-kj f 1 


rGA 


which is precisely the content of the lemma. 

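In view of the previous lemma we define for simplicity and later reference

$$
C(\mathcal{A},T) := \sum_{\substack{k_1,\dots,k_{2n}\in T\\ k_j\ne k_{j+1}}}\ \prod_{A\in\mathcal{A}} \delta\Big(\sum_{r\in A}(k_{r+1}-k_r)\Big)
\tag{3.15}
$$

$$
= \#\Big\{(k_1,\dots,k_{2n})\in T^{2n} :\ k_j\ne k_{j+1},\ j\in[2n],\ \text{and}\ \sum_{r\in A}(k_{r+1}-k_r) = 0\ \text{for all } A\in\mathcal{A}\Big\}.
$$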
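For small index sets the counting quantity $C(\mathcal{A},T)$ can be evaluated by direct enumeration. The sketch below does this for $d = 1$ (integer frequencies); the function name `count_C` and the representation of a partition as a list of sets of 1-based positions are illustrative assumptions, not notation from the paper.

```python
from itertools import product


def count_C(partition, T):
    """Brute-force C(A, T): count tuples (k_1, ..., k_2n) in T^(2n) with
    k_j != k_{j+1} (cyclically, k_{2n+1} = k_1) such that for every block A
    of the partition the sum over r in A of (k_{r+1} - k_r) vanishes.

    partition: list of sets of 1-based positions covering {1, ..., 2n}
    T: iterable of integers (the case d = 1)
    """
    size = sum(len(block) for block in partition)  # this is 2n
    total = 0
    for k in product(list(T), repeat=size):
        # neighbouring entries (cyclically) must differ
        if any(k[j] == k[(j + 1) % size] for j in range(size)):
            continue
        # position r is stored at k[r - 1]; position r + 1 wraps around to k[0]
        if all(sum(k[r % size] - k[r - 1] for r in block) == 0 for block in partition):
            total += 1
    return total
```

For instance, with the partition $\{\{1,2\},\{3,4\}\}$ both constraints reduce to $k_1 = k_3$, so the count is $|T|(|T|-1)^2$, which is below the deterministic bound $|T|^{2n-t+1} = |T|^3$ used later in the proof of Theorem 2.1.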


3.2 Analysis of $P(E_k)$

Let us now treat the first term $P(E_k)$ in (3.10) resp. (3.12). To this end let $\beta_m$, $m = 1,\dots,n$,
be positive numbers satisfying

$$
\sum_{m=1}^{n} \beta_m = a_1
$$

and $K_m\in\mathbb{N}$, $m = 1,\dots,n$, some natural numbers. Let $k\in[-q,q]^d$. Using Markov's inequality
in the last step we obtain

$$
\begin{aligned}
P(E_k) &= P\Big(\Big|\sum_{m=1}^{n}\big((N^{-1}HR_T)^m\,\mathrm{sgn}(c)\big)_k\Big| \ge a_1\Big)
\le \sum_{m=1}^{n} P\Big(\big|\big((N^{-1}HR_T)^m\,\mathrm{sgn}(c)\big)_k\big| \ge \beta_m\Big) \\
&= \sum_{m=1}^{n} P\Big(N^{-2mK_m}\big|\big((HR_T)^m\,\mathrm{sgn}(c)\big)_k\big|^{2K_m} \ge \beta_m^{2K_m}\Big) \\
&\le \sum_{m=1}^{n} E\Big[\big|\big((HR_T)^m\,\mathrm{sgn}(c)\big)_k\big|^{2K_m}\Big]\, N^{-2mK_m}\beta_m^{-2K_m}.
\end{aligned}
\tag{3.16}
$$

Let us choose $\beta_m = \beta^{n/K_m}$, i.e., $\beta_m^{-2K_m} = \beta^{-2n}$. This yields

$$
P(E_k) \le \beta^{-2n}\sum_{m=1}^{n} N^{-2mK_m}\, E\Big[\big|\big((HR_T)^m\,\mathrm{sgn}(c)\big)_k\big|^{2K_m}\Big]
\tag{3.17}
$$

and the condition $a_1 = \sum_{m=1}^n \beta_m$ reads

$$
a_1 = a = \sum_{m=1}^{n} \beta^{n/K_m} < 1.
$$

The following lemma is concerned with the expectation appearing in (3.17). We first investigate
the expectation with respect to $X$. The following proof is similar to the one of Lemma 3.3.

Lemma 3.4. For $k\in[-q,q]^d$ and $c\in\ell^2([-q,q]^d)$ with $\mathrm{supp}\,c = T$ we have

$$
E_X\Big[\big|\big((HR_T)^m\,\mathrm{sgn}(c)\big)_k\big|^{2K}\Big]
\le \sum_{t=1}^{\min\{Km,N\}} \frac{N!}{(N-t)!} \sum_{\mathcal{A}\in P(2Km,t)}\ \sum_{\substack{k_1^{(1)},\dots,k_m^{(2K)}\in T\\ k_{j-1}^{(p)}\ne k_j^{(p)}}}\ \prod_{A\in\mathcal{A}} \delta\Big(\sum_{(r,p)\in A}(-1)^p\big(k_r^{(p)}-k_{r-1}^{(p)}\big)\Big)
$$

with $k_0^{(p)} := k$ for $p = 1,\dots,2K$. Hereby, we identify partitions of $[2Km]$ in $P(2Km,t)$ with
partitions of $[2K]\times[m]$ in an obvious way.

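Proof: Set $\sigma := \mathrm{sgn}(c)$. An elementary calculation yields

$$
\big((HR_T)^m\sigma\big)_k = (-1)^m \sum_{\ell_1,\dots,\ell_m=1}^{N}\ \sum_{\substack{k_1,\dots,k_m\in T\\ k_{j-1}\ne k_j,\ j=1,\dots,m}} \sigma(k_m)\, e^{i(k_m-k_{m-1})\cdot x_{\ell_m}}\cdots e^{i(k_1-k_0)\cdot x_{\ell_1}}
$$

with $k_0 := k$. Thus,

$$
\big|\big((HR_T)^m\sigma\big)_{k_0}\big|^2 = \sum_{\ell_1^{(1)},\dots,\ell_m^{(1)}=1}^{N}\ \sum_{\ell_1^{(2)},\dots,\ell_m^{(2)}=1}^{N}\ \sum_{\substack{k_1^{(1)},\dots,k_m^{(1)}\in T\\ k_1^{(2)},\dots,k_m^{(2)}\in T\\ k_{j-1}^{(p)}\ne k_j^{(p)},\ j\in[m],\ p=1,2}} \sigma\big(k_m^{(1)}\big)\overline{\sigma\big(k_m^{(2)}\big)}\,
\exp\Big(i\sum_{r=1}^{m}\big(k_r^{(1)}-k_{r-1}^{(1)}\big)\cdot x_{\ell_r^{(1)}} - i\sum_{r=1}^{m}\big(k_r^{(2)}-k_{r-1}^{(2)}\big)\cdot x_{\ell_r^{(2)}}\Big)
$$

where $k_0^{(1)} = k_0^{(2)} = k_0 = k$. Taking the $K$-th power yields

$$
\big|\big((HR_T)^m\,\mathrm{sgn}(c)\big)_k\big|^{2K}
= \sum_{\ell_1^{(1)},\dots,\ell_m^{(2K)}=1}^{N}\ \sum_{\substack{k_1^{(1)},\dots,k_m^{(2K)}\in T\\ k_{j-1}^{(p)}\ne k_j^{(p)}}} \sigma\big(k_m^{(1)}\big)\overline{\sigma\big(k_m^{(2)}\big)}\cdots\sigma\big(k_m^{(2K-1)}\big)\overline{\sigma\big(k_m^{(2K)}\big)}\,
\exp\Big(-i\sum_{p=1}^{2K}(-1)^p\sum_{r=1}^{m}\big(k_r^{(p)}-k_{r-1}^{(p)}\big)\cdot x_{\ell_r^{(p)}}\Big)
$$

with $k_0^{(p)} = k$, $p = 1,\dots,2K$. Further, recall that $|\sigma(k)| = 1$ on $T$. Taking the expected value
$E_X$, and using that each of the expectations on the right hand side below is real (so that the
overall sign of the exponent is immaterial), yields

$$
E_X\Big[\big|\big((HR_T)^m\,\mathrm{sgn}(c)\big)_k\big|^{2K}\Big]
\le \sum_{\ell_1^{(1)},\dots,\ell_m^{(2K)}=1}^{N}\ \sum_{\substack{k_1^{(1)},\dots,k_m^{(2K)}\in T\\ k_{j-1}^{(p)}\ne k_j^{(p)}}} E_X\Big[\exp\Big(i\sum_{p=1}^{2K}(-1)^p\sum_{r=1}^{m}\big(k_r^{(p)}-k_{r-1}^{(p)}\big)\cdot x_{\ell_r^{(p)}}\Big)\Big]
\tag{3.18}
$$

(with equality if all the entries of $\sigma$ are equal on $T$).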

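Let us consider the expected value appearing in the sum. As in the proof of Lemma 3.3
we have to take into account that some of the indices $\ell_r^{(p)}$ might coincide. This requires us to
introduce some additional notation. Let $(\ell_r^{(p)})_{r\in[m],\,p\in[2K]}\in\{1,\dots,N\}^{2Km}$ be some vector of
indices and let $\mathcal{A} = (A_1,\dots,A_t)$, $A_i\subset\{1,\dots,m\}\times\{1,\dots,2K\}$, be a corresponding partition
such that $(r,p)$ and $(r',p')$ are contained in the same block if and only if $\ell_r^{(p)} = \ell_{r'}^{(p')}$. For some
$A\in\mathcal{A}$ we may unambiguously write $\ell_A$ instead of $\ell_r^{(p)}$ if $(r,p)\in A$.

As in the derivation of (3.14), using that all $\ell_A$ for $A\in\mathcal{A}$ are different and that the $x_{\ell_A}$ are
independent we may write the expectation in the sum in (3.18) as

$$
E\Big[\exp\Big(i\sum_{p=1}^{2K}(-1)^p\sum_{r=1}^{m}\big(k_r^{(p)}-k_{r-1}^{(p)}\big)\cdot x_{\ell_r^{(p)}}\Big)\Big]
= \prod_{A\in\mathcal{A}} E\Big[\exp\Big(i\sum_{(r,p)\in A}(-1)^p\big(k_r^{(p)}-k_{r-1}^{(p)}\big)\cdot x_{\ell_A}\Big)\Big]
= \prod_{A\in\mathcal{A}} \delta\Big(\sum_{(r,p)\in A}(-1)^p\big(k_r^{(p)}-k_{r-1}^{(p)}\big)\Big).
$$

Once again, if $A\in\mathcal{A}$ contains only one element then the last expression vanishes due to the
condition $k_r^{(p)}\ne k_{r-1}^{(p)}$. Thus, we only need to consider partitions $\mathcal{A}$ in $P(2Km,t)$. Now we are
able to rewrite the inequality in (3.18) as

$$
\begin{aligned}
E_X\Big[\big|\big((HR_T)^m\sigma\big)_k\big|^{2K}\Big]
&\le \sum_{t=1}^{\min\{Km,N\}} \sum_{\mathcal{A}\in P(2Km,t)}\ \sum_{\substack{\ell_{A_1},\dots,\ell_{A_t}\in\{1,\dots,N\}\\ \text{pairwise different}}}\ \sum_{\substack{k_1^{(1)},\dots,k_m^{(2K)}\in T\\ k_{j-1}^{(p)}\ne k_j^{(p)}}}\ \prod_{A\in\mathcal{A}} \delta\Big(\sum_{(r,p)\in A}(-1)^p\big(k_r^{(p)}-k_{r-1}^{(p)}\big)\Big) \\
&= \sum_{t=1}^{\min\{Km,N\}} \frac{N!}{(N-t)!} \sum_{\mathcal{A}\in P(2Km,t)}\ \sum_{\substack{k_1^{(1)},\dots,k_m^{(2K)}\in T\\ k_{j-1}^{(p)}\ne k_j^{(p)}}}\ \prod_{A\in\mathcal{A}} \delta\Big(\sum_{(r,p)\in A}(-1)^p\big(k_r^{(p)}-k_{r-1}^{(p)}\big)\Big).
\end{aligned}
$$

This proves the lemma. ■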

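In view of the previous lemma and for the sake of simple notation we denote

$$
B(\mathcal{A},T) := \sum_{\substack{k_1^{(1)},\dots,k_m^{(2K)}\in T\\ k_{j-1}^{(p)}\ne k_j^{(p)},\ j\in[m],\ p\in[2K]}}\ \prod_{A\in\mathcal{A}} \delta\Big(\sum_{(r,p)\in A}(-1)^p\big(k_r^{(p)}-k_{r-1}^{(p)}\big)\Big).
\tag{3.19}
$$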


3.3 Proof of Theorem 2.1

Let us assemble all the pieces to complete the proof of Theorem 2.1. By Lemma 3.3 we
need to investigate the quantity $C(\mathcal{A},T)$ defined in (3.15) for $\mathcal{A}\in P(2n,t)$. Here the indices
$(k_1,\dots,k_{2n})\in T^{2n}$ are subjected to the $|\mathcal{A}| = t$ linear constraints $\sum_{r\in A}(k_{r+1}-k_r) = 0$ for
all $A\in\mathcal{A}$. These constraints are independent except for $\sum_{r=1}^{2n}(k_{r+1}-k_r) = 0$. Thus, we can
estimate

$$
C(\mathcal{A},T) \le |T|^{2n-t+1} \le M^{2n-t+1}.
\tag{3.20}
$$

By Lemma 3.3 we obtain (note that in the situation of Theorem 2.1 $T$ is not random, so
$E = E_X$)

$$
E\big[\|H_0^n\|_F^2\big] \le \sum_{t=1}^{\min\{n,N\}} \frac{N!}{(N-t)!} \sum_{\mathcal{A}\in P(2n,t)} M^{2n+1-t} \le M^{2n+1}\sum_{t=1}^{n} (N/M)^t\, S_2(2n,t),
$$

where $S_2(2n,t) = |P(2n,t)|$ are the associated Stirling numbers of the second kind. Set $\theta =
N/M$. From the generating function (2.2) of the numbers $S_2(n,k)$ we know that

$$
\sum_{t=1}^{n} S_2(2n,t)\,\theta^t = F_{2n}(\theta)
$$

with $F_{2n}$ as defined above. Markov's inequality yields

$$
P\big(\|(N^{-1}H_0)^n\|_F \ge \kappa\big) = P\big(\|H_0^n\|_F^2 \ge N^{2n}\kappa^2\big)
\le N^{-2n}\kappa^{-2} E\big[\|H_0^n\|_F^2\big] \le \kappa^{-2} M\,\theta^{-2n}F_{2n}(\theta) = \kappa^{-2} M\, G_{2n}(\theta).
$$

We remark that by (3.8) we have $\kappa < 1$. In the event that $\|(N^{-1}H_0)^n\|_F < \kappa$ this implies that
$(I_T - (N^{-1}H_0)^n)$ is invertible by the von Neumann series and by (3.3) also

$$
F_{TX}^* F_{TX} = N\big(I_T - N^{-1}H_0\big)
$$

is invertible. In particular, $F_{TX}$ is injective. So this basic condition in Lemma 3.1 is satisfied
automatically with a probability that can be derived from the estimation above, and we do
not even need to invoke Lemma 3.2.

Let us now consider $P(E_k)$. By Lemma 3.4 we need to bound $B(\mathcal{A},T)$ defined in (3.19),
i.e., the number of vectors $(k_j^{(p)})\in T^{2Km}$ satisfying $\sum_{(r,p)\in A}(-1)^p\big(k_r^{(p)}-k_{r-1}^{(p)}\big) = 0$ for all
$A\in\mathcal{A}$ with $\mathcal{A}\in P(2Km,t)$. These are $t$ independent linear constraints. So the number of
these indices is bounded from above by $|T|^{2Km-t}\le M^{2Km-t}$. Thus, similarly as above we
obtain

$$
E\Big[\big|\big((HR_T)^m\,\mathrm{sgn}(c)\big)_k\big|^{2K}\Big] \le \sum_{t=1}^{Km} N^t\, S_2(2Km,t)\, M^{2Km-t} = M^{2Km} F_{2Km}(\theta).
$$

By (3.17) this yields

$$
P(E_k) \le \beta^{-2n}\sum_{m=1}^{n} \theta^{-2mK_m} F_{2mK_m}(\theta) = \beta^{-2n}\sum_{m=1}^{n} G_{2mK_m}(\theta).
$$

Let $P(\text{failure})$ denote the probability that exact reconstruction of $f$ by $\ell^1$-minimization fails.
By Lemma 3.1, (3.12) and by the union bound we finally obtain

$$
\begin{aligned}
P(\text{failure}) &\le P\Big(\{F_{TX}\ \text{is not injective}\}\cup\Big\{\sup_{k\in T^c}|P_k| \ge 1\Big\}\Big) \\
&\le \sum_{k\in[-q,q]^d} P(E_k) + P\big(\|(N^{-1}H_0)^n\|_F \ge \kappa\big) \le D\beta^{-2n}\sum_{m=1}^{n} G_{2mK_m}(\theta) + \kappa^{-2} M\, G_{2n}(\theta)
\end{aligned}
$$

under the conditions

$$
a_1 = a = \sum_{m=1}^{n}\beta^{n/K_m} < 1, \qquad a_2 + a_1 = 1\ \text{ i.e. }\ a_2 = 1-a,
$$

$$
\frac{\kappa}{1-\kappa} \le \frac{a_2}{1+a_1}\,M^{-3/2} = \frac{1-a}{1+a}\,M^{-3/2},
$$

see (3.8). This proves Theorem 2.1.

3.4 Proof of Theorem 2.3

Recall that here $T$ is a random set, modelled as in Theorem 2.3. The completion of the proof of
Theorem 2.3 will be slightly more complicated than above because we still need to take the
expectation with respect to the set $T$ in Lemmas 3.3 and 3.4. Let us start with the expectation
of $C(\mathcal{A},T)$ defined in (3.15).

Lemma 3.5. For $\mathcal{A}\in P(2n,t)$ it holds

$$
E[C(\mathcal{A},T)] \le \sum_{s=2}^{2n}(E|T|)^s \sum_{R=0}^{\min\{s,t\}-1} D^{-R}\,\#\{\mathcal{B}\in U(2n,s),\ \mathrm{rank}\,M(\mathcal{A},\mathcal{B}) = R\}.
$$


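Proof: Using linearity of expectation we obtain

$$
\begin{aligned}
E[C(\mathcal{A},T)] &= E\Big[\sum_{\substack{k_1,\dots,k_{2n}\in[-q,q]^d\\ k_j\ne k_{j+1}}}\ \prod_{j=1}^{2n} I_{\{k_j\in T\}}\ \prod_{A\in\mathcal{A}} \delta\Big(\sum_{r\in A}(k_{r+1}-k_r)\Big)\Big] \\
&= \sum_{\substack{k_1,\dots,k_{2n}\in[-q,q]^d\\ k_j\ne k_{j+1}}} E\Big[\prod_{j=1}^{2n} I_{\{k_j\in T\}}\Big]\ \prod_{A\in\mathcal{A}} \delta\Big(\sum_{r\in A}(k_{r+1}-k_r)\Big).
\end{aligned}
$$

Hereby, $I_{\{k\in T\}}$ denotes an indicator variable which is $1$ if and only if $k\in T$. The expression
$E\big[\prod_{j=1}^{2n} I_{\{k_j\in T\}}\big]$ depends on how many different $k_j$'s there are. So once again partitions enter
the game. If $(k_1,\dots,k_{2n})\in([-q,q]^d)^{2n}$ is a vector satisfying $k_j\ne k_{j+1}$ then we associate a
partition $\mathcal{B} = (B_1,\dots,B_s)$ of $\{1,\dots,2n\}$ such that $j$ and $j'$ are in the same set $B_i$ if and only
if $k_j = k_{j'}$. Obviously, $j$ and $j+1$ must be contained in different blocks for all $j$ due to the
condition $k_j\ne k_{j+1}$ (once again we agree on the convention that $2n+1 = 1$). In other words
$\mathcal{B}$ has no adjacencies, i.e., $\mathcal{B}\in U(2n,s)$. Now if $\mathcal{B}$ has $|\mathcal{B}| = s$ blocks then by the probability
model for $T$ and stochastic independence

$$
E\Big[\prod_{j=1}^{2n} I_{\{k_j\in T\}}\Big] = E\Big[\prod_{j=1}^{s} I_{\{k_{B_j}\in T\}}\Big] = \prod_{j=1}^{s} E\big[I_{\{k_{B_j}\in T\}}\big] = \tau^s
\tag{3.21}
$$

where (unambiguously) $k_{B_j} = k_i$ if $i\in B_j$. We further introduce the notation $\sigma_{\mathcal{B}}(r) = j$ if and
only if $r\in B_j\in\mathcal{B}$. This leads to

$$
E[C(\mathcal{A},T)] = \sum_{s=2}^{2n}\tau^s \sum_{\mathcal{B}\in U(2n,s)}\ \sum_{\substack{k_1,\dots,k_s\in[-q,q]^d\\ k_i\ \text{pairwise different}}}\ \prod_{A\in\mathcal{A}} \delta\Big(\sum_{r\in A}\big(k_{\sigma_{\mathcal{B}}(r+1)}-k_{\sigma_{\mathcal{B}}(r)}\big)\Big).
$$

Clearly, the expression $\prod_{A\in\mathcal{A}}\delta\big(\sum_{r\in A}(k_{\sigma_{\mathcal{B}}(r+1)}-k_{\sigma_{\mathcal{B}}(r)})\big)$ is $1$ if and only if

$$
\sum_{r\in A}\big(k_{\sigma_{\mathcal{B}}(r)}-k_{\sigma_{\mathcal{B}}(r+1)}\big) = 0 \quad\text{for all } A\in\mathcal{A}
\tag{3.22}
$$

and $0$ otherwise. For $j\in\{1,\dots,s\}$ the term $k_j$ appears $|A_i\cap B_j|$ times as $k_{\sigma_{\mathcal{B}}(r)}$ when $r$ runs
through $A_i\in\mathcal{A}$. Let $M = M(\mathcal{A},\mathcal{B})$ denote the $t\times s$ matrix whose entries are defined by
(2.5). Then (3.22) is satisfied if and only if $(k_1,\dots,k_s)\in([-q,q]^d)^s$ is contained in the kernel
of $M(\mathcal{A},\mathcal{B})$. Thus, if the rank of $M(\mathcal{A},\mathcal{B})$ equals $R$ then the number of vectors $(k_1,\dots,k_s)\in
([-q,q]^d)^s$ for which (3.22) is satisfied can be bounded by $D^{s-R}$ where $D = (2q+1)^d$. (Here
we even neglected the condition that the $k_1,\dots,k_s$ should be pairwise different.) So finally we
obtain

$$
\begin{aligned}
E[C(\mathcal{A},T)] &\le \sum_{s=2}^{2n}\tau^s \sum_{R=0}^{\min\{s,t\}-1} D^{s-R}\,\#\{\mathcal{B}\in U(2n,s),\ \mathrm{rank}\,M(\mathcal{A},\mathcal{B}) = R\} \\
&= \sum_{s=2}^{2n}(E|T|)^s \sum_{R=0}^{\min\{s,t\}-1} D^{-R}\,\#\{\mathcal{B}\in U(2n,s),\ \mathrm{rank}\,M(\mathcal{A},\mathcal{B}) = R\},
\end{aligned}
$$

where we substituted $E|T| = \tau D$. ■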

Since $E = E_X E_T$ by Fubini's theorem and stochastic independence of $T$ and $X$ the previous
result yields together with Lemma 3.3

$$
\begin{aligned}
E\big[\|H_0^n\|_F^2\big] &\le \sum_{t=1}^{\min\{n,N\}} \frac{N!}{(N-t)!} \sum_{\mathcal{A}\in P(2n,t)} \sum_{s=2}^{2n}(E|T|)^s \sum_{R=0}^{\min\{s,t\}-1} D^{-R}\,\#\{\mathcal{B}\in U(2n,s),\ \mathrm{rank}\,M(\mathcal{A},\mathcal{B}) = R\} \\
&= \sum_{t=1}^{\min\{n,N\}} \frac{N!}{(N-t)!} \sum_{s=2}^{2n}(E|T|)^s \sum_{R=0}^{\min\{s,t\}-1} D^{-R}\, Q(2n,t,s,R) = N^{2n}\, W(n,N,E|T|,D)
\end{aligned}
$$

by definition (2.6) of the numbers $Q(2n,t,s,R)$ and by the definition of the function $W$.
Markov's inequality yields

$$
P\big(\|(N^{-1}H_0)^n\|_F \ge \kappa\big) \le N^{-2n}\kappa^{-2} E\big[\|H_0^n\|_F^2\big] \le \kappa^{-2}\, W(n,N,E|T|,D).
$$

We remark that by the same argument as in the proof of Theorem 2.1, $F_{TX}$ is injective in the
event $\|(N^{-1}H_0)^n\|_F < \kappa$.

Let us turn now to the estimation of $P(E_k)$. From Lemma 3.4 one realizes that we need to
estimate the expected value of $B(\mathcal{A},T)$ defined in (3.19).

Lemma 3.6. For $\mathcal{A}\in P(2Km,t)$ it holds

$$
E[B(\mathcal{A},T)] \le \sum_{s=1}^{2Km}(E|T|)^s \sum_{R=0}^{\min\{s,t\}} D^{-R}\,\#\{\mathcal{B}\in U^*(2K,m,s),\ \mathrm{rank}\,L(\mathcal{A},\mathcal{B}) = R\}.
$$

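Proof: As in the proof of the previous lemma we may write

$$
E[B(\mathcal{A},T)] = E\Big[\sum_{\substack{(k_j^{(p)})\in([-q,q]^d)^{2Km}\\ k_j^{(p)}\ne k_{j-1}^{(p)}}}\ \prod_{(p,j)\in[2K]\times[m]} I_{\{k_j^{(p)}\in T\}}\ \prod_{A\in\mathcal{A}} \delta\Big(\sum_{(r,p)\in A}(-1)^p\big(k_r^{(p)}-k_{r-1}^{(p)}\big)\Big)\Big].
$$

Once again $E\big[\prod_{(p,j)\in[2K]\times[m]} I_{\{k_j^{(p)}\in T\}}\big]$ depends on how many different $k_j^{(p)}$'s there are. So if
$(k_1^{(1)},\dots,k_m^{(2K)})\in([-q,q]^d)^{2Km}$ is a vector satisfying

$$
k_j^{(p)}\ne k_{j-1}^{(p)} \quad\text{for all } j\in[m],\ p\in[2K]
\tag{3.23}
$$

then we associate a partition $\mathcal{B} = (B_1,\dots,B_s)$ of $[2K]\times[m]$ such that $(p,j)$ and $(p',j')$ are
contained in the same block if and only if $k_j^{(p)} = k_{j'}^{(p')}$. Obviously, $(p,j)$ and $(p,j-1)$ cannot
be contained in the same block due to the condition (3.23). In other words, $\mathcal{B}$ belongs to
$U^*(2K,m,s)$. Now, if $\mathcal{B}$ has $s$ blocks, i.e., there are $s$ different values of $k_j^{(p)}$, then

$$
E\Big[\prod_{(p,j)\in[2K]\times[m]} I_{\{k_j^{(p)}\in T\}}\Big] = \tau^s
$$

as in (3.21). Once more, we use the notation $\sigma_{\mathcal{B}}(p,j) = i$ if $(p,j)\in B_i\in\mathcal{B}$ and $\sigma_{\mathcal{B}}(p,0) = 0$.
(Recall that by definition $k_0^{(p)} = k_0 = k$.) Thus,

$$
E[B(\mathcal{A},T)] = \sum_{s=1}^{2Km}\tau^s \sum_{\mathcal{B}\in U^*(2K,m,s)}\ \sum_{\substack{k_1,\dots,k_s\in[-q,q]^d\\ k_i\ \text{pairwise different}}}\ \prod_{A\in\mathcal{A}} \delta\Big(\sum_{(p,j)\in A}(-1)^p\big(k_{\sigma_{\mathcal{B}}(p,j)}-k_{\sigma_{\mathcal{B}}(p,j-1)}\big)\Big).
$$

The term $\prod_{A\in\mathcal{A}}\delta\big(\sum_{(p,j)\in A}(-1)^p(k_{\sigma_{\mathcal{B}}(p,j)}-k_{\sigma_{\mathcal{B}}(p,j-1)})\big)$ contributes to the sum if and only if

$$
\sum_{(p,j)\in A}(-1)^p\big(k_{\sigma_{\mathcal{B}}(p,j)}-k_{\sigma_{\mathcal{B}}(p,j-1)}\big) = 0 \quad\text{for all } A\in\mathcal{A}.
$$

By definition (2.7) of the matrix $L(\mathcal{A},\mathcal{B})$ and since $k_0 = k$ this is equivalent to

$$
L(\mathcal{A},\mathcal{B})(k_1,\dots,k_s)^T = k\,v(\mathcal{A},\mathcal{B}),
\tag{3.24}
$$

where $v = v(\mathcal{A},\mathcal{B})$ is the $t$-dimensional vector with entries

$$
v_i = \sum_{(p,1)\in A_i}(-1)^p, \qquad i = 1,\dots,t.
$$

(If $d > 1$ then (3.24) has to be interpreted vector-valued, i.e., for each component of $k\in[-q,q]^d$
and of $k_1,\dots,k_s\in[-q,q]^d$ we have one equation with the same $L(\mathcal{A},\mathcal{B})$ and the same $v(\mathcal{A},\mathcal{B})$.)
If the rank of $L(\mathcal{A},\mathcal{B})$ equals $R$ then we can bound the number of solutions to (3.24) by $D^{s-R}$.
Hence, we obtain the bound

$$
E[B(\mathcal{A},T)] \le \sum_{s=1}^{2Km}\tau^s \sum_{R=0}^{\min\{s,t\}} D^{s-R}\,\#\{\mathcal{B}\in U^*(2K,m,s),\ \mathrm{rank}\,L(\mathcal{A},\mathcal{B}) = R\}.
$$

Since $E|T| = \tau D$ this proves the lemma. ■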

Together with Lemma 3.4 the previous result yields

$$
E\Big[\big|\big((HR_T)^m\sigma\big)_k\big|^{2K}\Big] \le \sum_{t=1}^{\min\{Km,N\}} \frac{N!}{(N-t)!} \sum_{s=1}^{2Km}(E|T|)^s \sum_{R=0}^{\min\{s,t\}} Q^*(2K,m,t,s,R)\, D^{-R}
= N^{2Km}\, Z(K,m,N,E|T|,D)
$$

where $Q^*(2K,m,t,s,R)$ are the numbers defined in (2.8). By (3.17) we obtain

$$
P(E_k) \le \beta^{-2n}\sum_{m=1}^{n} Z(K_m,m,N,E|T|,D).
$$

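Finally, let $P(\text{failure})$ denote the probability that exact reconstruction of $f$ fails. By Lemma
3.1, (3.10), (3.11) and using that $\{F_{TX}\ \text{is not injective}\}\subset\{\|(N^{-1}H_0)^n\|_F \ge \kappa\}$ we obtain

$$
\begin{aligned}
P(\text{failure}) &\le P\Big(\{F_{TX}\ \text{is not injective}\}\cup\Big\{\sup_{k\in T^c}|P_k| \ge 1\Big\}\Big) \\
&\le \sum_{k\in[-q,q]^d} P(E_k) + P\big(\|(N^{-1}H_0)^n\|_F \ge \kappa\big) + P\big(|T| \ge (\alpha+1)E|T|\big) \\
&\le D\beta^{-2n}\sum_{m=1}^{n} Z(K_m,m,N,E|T|,D) + \kappa^{-2}\, W(n,N,E|T|,D) + \exp\!\left(-\frac{3\alpha^2}{6+2\alpha}\,E|T|\right)
\end{aligned}
$$

under the conditions

$$
a_1 = a = \sum_{m=1}^{n}\beta^{n/K_m} < 1, \qquad a_2 + a_1 = 1\ \text{ i.e. }\ a_2 = 1-a,
$$

$$
\frac{\kappa}{1-\kappa} \le \frac{a_2}{1+a_1}\big((\alpha+1)E|T|\big)^{-3/2} = \frac{1-a}{1+a}\big((\alpha+1)E|T|\big)^{-3/2},
$$

as derived above. This proves Theorem 2.3.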


3.5 Proof of Corollary 2.2

We have to show that a finer analysis of the probability bound (2.11) of Theorem 2.1 gives
Corollary 2.2. We first claim that the associated Stirling numbers satisfy the estimate

$$
S_2(n,k) \le (3n/2)^{n-k} \quad\text{for all } k = 1,\dots,\lfloor n/2\rfloor.
\tag{3.25}
$$

Indeed, the claim is true for $S_2(1,k) = 0$ and $S_2(2,1) = 1$. Now suppose the claim is true for
all $S_2(m,k)$ with $m < n$. Then from the recursion formula (2.3) it follows

$$
\begin{aligned}
S_2(n,k) &= k\,S_2(n-1,k) + (n-1)\,S_2(n-2,k-1) \\
&\le k\,(3(n-1)/2)^{n-k-1} + (n-1)(3n/2-3)^{n-k-1} \le (n-1+k)(3n/2)^{n-k-1} \\
&\le (3n/2)^{n-k}
\end{aligned}
$$

since $n-1+k \le 3n/2$. This proves (3.25). Plugging this into the definition of $G_{2n}$ yields

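$$
\begin{aligned}
G_{2n}(\theta) &= \theta^{-2n}\sum_{k=1}^{n} S_2(2n,k)\,\theta^k \le \theta^{-2n}\sum_{k=1}^{n}(3n)^{2n-k}\theta^k = (3n/\theta)^{2n}\sum_{k=1}^{n}(\theta/3n)^k \\
&= (3n/\theta)^{2n}\,\frac{(\theta/3n)^{n+1}-(\theta/3n)}{(\theta/3n)-1} = (3n/\theta)^{2n-1}\,\frac{(\theta/3n)^{n}-1}{(\theta/3n)-1}.
\end{aligned}
$$

Now assume we have chosen $n$ such that $n \le \theta/6$. Then we further obtain

$$
G_{2n}(\theta) \le (3n/\theta)^{n-1}.
$$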
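Both estimates above are easy to check numerically for small parameters. The following sketch uses the recursion $S_2(n,k) = k\,S_2(n-1,k) + (n-1)\,S_2(n-2,k-1)$ quoted in the text and exact rational arithmetic; the function names and the base cases of the recursion ($S_2(0,0)=1$, zero whenever $n<2k$) are assumptions made for this illustration.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def S2(n: int, k: int) -> int:
    """Associated Stirling number of the second kind: number of partitions
    of an n-element set into k blocks, each with at least two elements."""
    if k == 0:
        return 1 if n == 0 else 0
    if k < 0 or n < 2 * k:
        return 0
    return k * S2(n - 1, k) + (n - 1) * S2(n - 2, k - 1)


def G(n: int, theta) -> Fraction:
    """G_n(theta) = theta^(-n) * sum_{k=1}^{floor(n/2)} S2(n, k) * theta^k."""
    theta = Fraction(theta)
    F = sum(S2(n, k) * theta**k for k in range(1, n // 2 + 1))
    return F / theta**n


def stirling_bound_holds(n: int, k: int) -> bool:
    """Check S2(n, k) <= (3n/2)^(n-k), cleared of denominators."""
    return S2(n, k) * 2 ** (n - k) <= (3 * n) ** (n - k)


def G_bound_holds(n: int, theta: int) -> bool:
    """Check G_{2n}(theta) <= (3n/theta)^(n-1); intended for n <= theta/6."""
    return G(2 * n, theta) <= Fraction(3 * n, theta) ** (n - 1)
```

For example, `G_bound_holds(n, 6 * n)` tests the boundary case $n = \theta/6$ of the assumption used above.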
Now consider the term $D\beta^{-2n}\sum_{m=1}^{n} G_{2mK_m}(\theta)$ from the probability bound (2.11). We choose
$K_m = r(n/m)$ where $r$ denotes the function that rounds to the nearest integer. Then it is easy
to see that

$$
mK_m \in \{\lceil 2n/3\rceil,\dots,\lfloor 4n/3\rfloor\}, \qquad m\in\{1,\dots,n\}.
$$

Thus,

$$
\sum_{m=1}^{n} G_{2mK_m}(\theta) \le n \max_{k\in\{\lceil 2n/3\rceil,\dots,\lfloor 4n/3\rfloor\}} G_{2k}(\theta) \le n\left(\frac{4n}{\theta}\right)^{\lceil 2n/3\rceil-1}
$$

provided $k \le \theta/6$ for all $k\in\{\lceil 2n/3\rceil,\dots,\lfloor 4n/3\rfloor\}$, i.e., $6\lfloor 4n/3\rfloor \le \theta$. This yields

$$
D\beta^{-2n}\sum_{m=1}^{n} G_{2mK_m}(\theta) \le Dn\left(\frac{4n}{\theta}\right)^{-1}\left(\beta^{-3}\,\frac{4n}{\theta}\right)^{2n/3}.
$$

In order to make this expression small it is certainly a good strategy to make the last term
smaller than 1. Indeed, choose

$$
n = n(\theta) := \left\lfloor\frac{\beta^3\theta}{8}\right\rfloor
\tag{3.26}
$$

implying $\beta^{-3}\,4n/\theta \le 1/2$. (This choice for $n$ is certainly valid since $\beta < 1$ as it must satisfy
the condition $a < 1$.) We obtain

$$
D\beta^{-2n}\sum_{m=1}^{n} G_{2mK_m}(\theta) \le \frac{1}{4}\,D\,\theta\, 2^{-2n(\theta)/3}.
$$


A simple calculation yields that the latter term is less than $\epsilon/2$ if

$$
\frac{2\ln 2}{3}\,n(\theta) - \ln(\theta) \ge \ln(D) + \ln(\epsilon^{-1}) - \ln(2).
$$

Furthermore, a simple numerical test shows that a valid choice for $\beta$ is $\beta = 0.47$. The
corresponding $a = \sum_{m=1}^{n}\beta^{n/K_m}$ is always less than $0.957$ and $n(\theta)\approx\lfloor 0.013\,\theta\rfloor$. Recalling that
$\theta = N/M$ it follows that there exists a constant $C_1$ such that $D\beta^{-2n}\sum_{m=1}^{n} G_{2mK_m}(\theta) \le \epsilon/2$
provided

$$
N \ge C_1 M\big(\ln(D) + \ln(\epsilon^{-1})\big).
$$
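The "simple numerical test" mentioned above can be reproduced along the following lines. This is only a sketch: it assumes that the rounding function $r$ rounds half-integers upwards (the text does not specify the tie-breaking rule), and the helper names are hypothetical.

```python
def K_values(n: int) -> list:
    """K_m = r(n/m) for m = 1, ..., n, with r rounding to the nearest integer
    (ties rounded up; this tie-breaking rule is an assumption)."""
    return [(2 * n + m) // (2 * m) for m in range(1, n + 1)]


def a_value(beta: float, n: int) -> float:
    """a = sum_{m=1}^{n} beta^(n / K_m)."""
    return sum(beta ** (n / K) for K in K_values(n))


def products_in_range(n: int) -> bool:
    """Check that m * K_m lies in {ceil(2n/3), ..., floor(4n/3)} for all m."""
    lower = -(-2 * n // 3)   # ceil(2n/3) in integer arithmetic
    upper = 4 * n // 3       # floor(4n/3)
    return all(lower <= m * K <= upper for m, K in enumerate(K_values(n), start=1))


# Largest value of a over a range of n for the choice beta = 0.47.
largest_a = max(a_value(0.47, n) for n in range(1, 201))
```

Under the stated rounding assumption, `largest_a` stays below the value 0.957 quoted in the text for all $n$ up to 200.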

Now consider the other term $M\kappa^{-2}G_{2n}(\theta)$ in the probability bound (2.11). We choose $\kappa$ such
that there is equality in (3.8), i.e.,

$$
\kappa = \frac{\frac{1-a}{1+a}\,M^{-3/2}}{1 + \frac{1-a}{1+a}\,M^{-3/2}} \ge \frac{1-a}{2(1+a)}\,M^{-3/2}.
$$

Hence,

$$
M\kappa^{-2}G_{2n}(\theta) \le \left(\frac{2(1+a)}{1-a}\right)^{2} M^4\, G_{2n}(\theta).
$$

Now we do not have the freedom anymore to choose $n$. We have to make the same choice
(3.26) as above. This yields

$$
M\kappa^{-2}G_{2n(\theta)}(\theta) \le \left(\frac{2(1+a)}{1-a}\right)^{2} M^4 \left(\frac{3\beta^3}{8}\right)^{n(\theta)-1}.
$$

Requiring that the latter expression is less than $\epsilon/2$ is equivalent to

$$
(n(\theta)-1)\,\ln\!\left(\frac{8}{3\beta^3}\right) \ge \ln\!\left(8\left(\frac{1+a}{1-a}\right)^{2}\right) + 4\ln(M) + \ln(\epsilon^{-1}).
$$

As already remarked the choice $\beta = 0.47$ results in $a < 0.957$ and $n(\theta)\approx\lfloor 0.013\,\theta\rfloor$. Hence,
$\ln(8/(3\beta^3))\approx 3.2459$ and $\ln(8((1+a)/(1-a))^2)\approx 9.7153$. Since $M \le D$ there exists
a constant $C_2$ (whose precise value may be calculated from the numbers above) such that
$M\kappa^{-2}G_{2n(\theta)}(\theta) \le \epsilon/2$ provided

$$
N \ge C_2 M\big(\ln(D) + \ln(\epsilon^{-1})\big).
$$

Choosing $C := \max\{C_1,C_2\}$ completes the proof of Corollary 2.2.
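The numerical constants quoted in this proof can be recomputed directly. The snippet below is a small check, assuming that the coefficient $0.013$ in $n(\theta)\approx\lfloor 0.013\,\theta\rfloor$ arises as $\beta^3/8$ (an inference from the requirement $\beta^{-3}\,4n/\theta\le 1/2$) and that the second logarithm is evaluated at $a = 0.957$; the variable names are illustrative.

```python
from math import log

beta = 0.47
a = 0.957  # upper value for a quoted in the text

# Coefficient of theta in n(theta); assumed here to be beta^3 / 8.
n_coefficient = beta**3 / 8

# Logarithmic constants appearing in the final condition on n(theta).
log_decay = log(8 / (3 * beta**3))
log_prefactor = log(8 * ((1 + a) / (1 - a)) ** 2)
```

With these values, `n_coefficient` is about 0.013, `log_decay` about 3.2459 and `log_prefactor` about 9.7153, in agreement with the figures stated above.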

We remark that analyzing numerical plots for $\beta^{-2n'(\theta)}\sum_{m=1}^{n'(\theta)} G_{2mK_m}(\theta)$ and $G_{2n'(\theta)}(\theta)$
for $n'(\theta) = \lfloor\theta/12\rfloor$ indicates that one may choose the constant $C$ much smaller than the ones
resulting from the theoretical analysis above. It seems that $C \lesssim 20$ is a valid choice.

3.6 Remarks 

We conclude this section with some remarks. 

(a) Let us give a more detailed reason why we believe that the probabilistic model for the
“sparsity set” $T$ is likely to give better probability bounds for exact reconstruction than
the deterministic approach holding for all $T$ of a given size. Indeed, the main difference
in the two previous proofs lies in the estimation of $C(\mathcal{A},T)$ and $B(\mathcal{A},T)$ defined in
(3.15) and (3.19). If $|\mathcal{A}| = t$ then for deterministic $T$ we used the estimate (3.20), i.e.,
$C(\mathcal{A},T) \le |T|^{2n-t+1}$. Indeed, if $T$ is an arithmetic progression then $C(\mathcal{A},T)$ may come
very close to this upper bound. However, for generic sets $T$ the bound is quite pessimistic.
In fact, in the probabilistic model the expected size of $C(\mathcal{A},T)$ can be bounded by

$$
E[C(\mathcal{A},T)] \le \sum_{s=2}^{2n}(E|T|)^s \sum_{R=0}^{\min\{s,t\}-1} D^{-R}\,\#\{\mathcal{B}\in U(2n,s),\ \mathrm{rank}\,M(\mathcal{A},\mathcal{B}) = R\},
\tag{3.27}
$$

see Lemma 3.5. In particular, if $D$ is large (and $E|T|$ not too small) then the latter
estimate should be much better. Let us illustrate this with two examples.

1. Let $\mathcal{A} = \{\{1,2,3,5\},\{4,6\}\}$, i.e., $2n = 6$ and $t = 2$. Then (3.20) yields $C(\mathcal{A},T) \le
|T|^5$ while computing (3.27) explicitly gives

$$
E[C(\mathcal{A},T)] \le D^{-1}\big[(E|T|)^2 + 10(E|T|)^3 + 20(E|T|)^4 + 9(E|T|)^5 + (E|T|)^6\big].
$$

Clearly, if $D$ is sufficiently large then the probabilistic estimate is much better than
the deterministic one.

2. Let $\mathcal{A} = \{\{1,2,3\},\{4,5,6\}\}$, so again $2n = 6$ and $t = 2$. Then the deterministic
estimate gives again $C(\mathcal{A},T) \le |T|^5$ while (3.27) results in

$$
E[C(\mathcal{A},T)] \le (E|T|)^2 + 3(E|T|)^3 + (E|T|)^4 + D^{-1}\big[7(E|T|)^3 + 19(E|T|)^4 + 9(E|T|)^5 + (E|T|)^6\big].
$$

So here one has to choose both $E|T|$ and $D \gg E|T|$ large to see that potentially
the probabilistic estimate is much better.

(b) Discrete Fourier transforms: The proofs work without essential change if one
replaces our setting by the following one, similar to the situation investigated by Candes,
Romberg and Tao. Consider functions on the cyclic group $\mathbb{Z}_p^d = \{0,\dots,p-1\}^d$,
$p\in\mathbb{N}$, rather than on $[0,2\pi]^d$. The discrete Fourier transform is defined by

$$
\hat{f}(\omega) := \sum_{x\in\mathbb{Z}_p^d} f(x)\,e^{-2\pi i\,x\cdot\omega/p}, \qquad \omega\in\mathbb{Z}_p^d.
$$

We draw $x_1,\dots,x_N$ from the uniform distribution on $\mathbb{Z}_p^d$. Note that in contrast to
sampling from $[0,2\pi]^d$ it may occur with non-zero probability that some elements of $\mathbb{Z}_p^d$
are drawn more than once. But this will not do much harm.

Let $f$ be such that $\hat{f}$ is a sparse vector on $\mathbb{Z}_p^d$. Once again we try to reconstruct $f$ from
its sample values $f(x_j)$ by minimizing the $\ell^1$-norm of $\hat{f}$ under the constraint that the
observed values $f(x_j)$ are matched.

Theorems 2.1 and 2.3 will also apply to this situation. Indeed, the only thing that differs
in the proofs is that we have to calculate modulo $p$ in the definition of $C(\mathcal{A},T)$ and
$B(\mathcal{A},T)$, see (3.15) and (3.19). This is apparent from (3.13) where the integral is replaced
by a sum of exponentials. Nevertheless, the deterministic and probabilistic estimates
for the quantities $C(\mathcal{A},T)$ and $B(\mathcal{A},T)$ still hold and so everything goes through in
completely the same manner.

Of course, one can also exchange the role of $f$ and $\hat{f}$, aiming at reconstructing a sparse
signal on $\mathbb{Z}_p^d$ from random samples of its Fourier transform. Indeed, this situation is
investigated by Candes, Romberg and Tao with a different probability model for the sampling
points. In other words, we presented a slightly different approach for their main result.
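The reason for calculating modulo $p$ in the discrete setting can be illustrated with a few lines of code: the average of a character of $\mathbb{Z}_p$ over a uniformly distributed point is $1$ if the frequency is a multiple of $p$ and $0$ otherwise, which is the discrete analogue of the integral evaluated in the proof of Lemma 3.3. The sketch treats $d = 1$; the function name is an illustrative choice.

```python
import cmath


def mean_character(j: int, p: int) -> complex:
    """Average of exp(2*pi*i*j*x/p) over x uniformly distributed on
    Z_p = {0, ..., p-1}; equals 1 if p divides j and 0 otherwise
    (up to floating point rounding)."""
    return sum(cmath.exp(2j * cmath.pi * j * x / p) for x in range(p)) / p
```

So a linear constraint such as $\sum_{r\in A}(k_{r+1}-k_r) = 0$ becomes a congruence modulo $p$ in this setting.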


4 Some more on set partitions

From Theorems 2.1 and 2.3 we realize that we have to investigate the functions $F_n(\theta)$ connected to set
partitions in $P(n,t)$ and also the numbers $Q(n,t,s,R)$ and $Q^*(K,m,t,s,R)$, respectively. We
already gave some information on the number $S_2(n,t)$ of partitions in $P(n,t)$ earlier. Let us
be a bit more detailed here. Clearly, by the definition of $F_n$ and the generating function (2.2)
we see that

$$
F_n(\theta) = \sum_{k=1}^{\lfloor n/2\rfloor} S_2(n,k)\,\theta^k.
$$

(This follows also directly from the proof of Theorem 2.1.) In particular, $F_{2n}$ is a polynomial
of degree $n$. There are different ways of computing $F_n$ explicitly. One possibility is to use the
generating function (2.2) leading to

$$
F_n(\theta) = \frac{\partial^n}{\partial x^n}\exp\big(\theta(e^x-x-1)\big)\Big|_{x=0}.
$$

One may also compute the numbers $S_2(n,k)$ explicitly. Indeed, differentiating (2.2) $k$ times
with respect to $y$ and setting $y = 0$ yields

$$
\sum_{n=1}^{\infty} S_2(n,k)\,\frac{x^n}{n!} = \frac{1}{k!}\,(e^x-x-1)^k.
$$

Expanding the right hand side into a power series and comparing coefficients yields (after some
computations)

$$
S_2(n,k) = \frac{1}{k!}\left[k^n + \sum_{j=1}^{k-1}(-1)^j\binom{k}{j}\sum_{l=0}^{j}\binom{j}{l}\frac{n!}{(n-l)!}\,(k-j)^{n-l}\right]
\tag{4.1}
$$

valid for $n \ge 2k$ (otherwise $S_2(n,k) = 0$). In the special case $k = 2$ we obtain $S_2(n,2) =
2^{n-1}-n-1$. Further, a combinatorial argument shows that $S_2(2n,n) = \frac{(2n)!}{2^n n!}$. (One uses that
$P(2n,n)$ consists only of partitions where each block has precisely 2 elements.)

Let us give the first of the functions $F_{2n}$ explicitly in the following list,

$$
\begin{aligned}
F_2(y) &= y, \qquad F_4(y) = y + 3y^2, \qquad F_6(y) = y + 25y^2 + 15y^3, \\
F_8(y) &= y + 119y^2 + 490y^3 + 105y^4, \qquad F_{10}(y) = y + 501y^2 + 6825y^3 + 9450y^4 + 945y^5, \\
F_{12}(y) &= y + 2035y^2 + 74316y^3 + 302995y^4 + 190575y^5 + 10395y^6.
\end{aligned}
$$

Of course, explicit values of $S_2(n,k)$ can be read off this list.
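The coefficients in this list can be regenerated with a few lines of code. The sketch below compares the recursion $S_2(n,k) = k\,S_2(n-1,k) + (n-1)\,S_2(n-2,k-1)$ with an inclusion-exclusion formula obtained by expanding $(e^x-x-1)^k/k!$ via the binomial theorem, which is one way to write the explicit formula referred to as (4.1); the function names and the base cases of the recursion are assumptions of this illustration.

```python
from functools import lru_cache
from math import comb, factorial, perm


@lru_cache(maxsize=None)
def S2_recursive(n: int, k: int) -> int:
    """Associated Stirling numbers of the second kind via the recursion."""
    if k == 0:
        return 1 if n == 0 else 0
    if k < 0 or n < 2 * k:
        return 0
    return k * S2_recursive(n - 1, k) + (n - 1) * S2_recursive(n - 2, k - 1)


def S2_explicit(n: int, k: int) -> int:
    """Explicit formula from expanding (e^x - (1 + x))^k / k!; for n >= 2k."""
    if k < 1 or n < 2 * k:
        return 0
    total = k**n
    for j in range(1, k):
        # n! [x^n] (1 + x)^j e^{(k-j)x} = sum_l C(j,l) * n!/(n-l)! * (k-j)^(n-l)
        inner = sum(comb(j, l) * perm(n, l) * (k - j) ** (n - l) for l in range(j + 1))
        total += (-1) ** j * comb(k, j) * inner
    return total // factorial(k)


def F_coefficients(n: int) -> list:
    """Coefficients of y, y^2, ..., y^floor(n/2) in F_n(y)."""
    return [S2_recursive(n, k) for k in range(1, n // 2 + 1)]
```

For example, `F_coefficients(12)` reproduces the coefficients of $F_{12}$ listed above, and the last coefficient of `F_coefficients(2 * n)` is the number $(2n)!/(2^n n!)$ of perfect matchings.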

Now consider the number $p_n = \sum_{k=1}^{\lfloor n/2\rfloor} S_2(n,k)$ of all partitions of $[n]$ into subsets having at
least two elements. Setting $y = 1$ in the exponential generating function (2.2) yields

$$
\sum_{n=1}^{\infty} p_n\,\frac{x^n}{n!} = \exp(e^x-x-1).
\tag{4.2}
$$

Unfortunately, much less is known about the number of partitions in $U(n,s)$. As already
mentioned, it was only very recently that D. Knuth posed the problem of determining
$|U(n,s)|$. Let us denote by $u_n = \sum_{k=2}^{n}|U(n,k)|$ the number of all partitions of $\{1,\dots,n\}$
having no adjacencies (recall that $U(n,1) = \emptyset$). Recently, it was proved in [4] that $u_n = p_n$.
So (4.2) is also the exponential generating function of the numbers $u_n$. Concerning the size
of $U^*(K,m,s)$, up to now, we cannot say more than that it is bounded by the number of all
partitions into $s$ blocks of a set with $Km$ elements, i.e., by the (ordinary) Stirling number of
the second kind $S(Km,s)$. If $m = 1$ then $|U^*(K,1,s)| = S(K,s)$ as already remarked. The
Stirling numbers $S(n,k)$ have the generating function (see, e.g., [2])

$$
\sum_{n=1}^{\infty}\sum_{k=1}^{n} S(n,k)\,y^k\,\frac{x^n}{n!} = \exp\big(y(e^x-1)\big).
\tag{4.3}
$$

Let us denote $u^*_{K,m} = \sum_{k=1}^{Km}|U^*(K,m,k)|$. Then clearly $u^*_{K,m} \le b_{Km} = \sum_{k=1}^{Km} S(Km,k)$ with
equality if $m = 1$. A lower bound for $u^*_{K,m}$ is given by the numbers $p_{Km}$.

Now some elementary observations concerning the numbers $Q(n,t,s,R)$ and $Q^*(K,m,t,s,R)$
can be made. Disregarding the rank of $M(\mathcal{A},\mathcal{B})$, the number of all pairs $(\mathcal{A},\mathcal{B})$ with $\mathcal{A}\in P(n,t)$
and $\mathcal{B}\in U(n,s)$ is $|P(n,t)|\times|U(n,s)|$, hence, $\sum_R Q(n,t,s,R) = |P(n,t)|\times|U(n,s)|$ and
similarly for $Q^*(K,m,t,s,R)$. Summing also over $t$ and $s$ gives

$$
\sum_t\sum_s\sum_R Q(n,t,s,R) = u_n p_n = p_n^2
$$

and $\sum_t\sum_s\sum_R Q^*(K,m,t,s,R) = p_{Km}\,u^*_{K,m}$. In the following table we give some values of $p_n$ and
$b_n$ for even $n = 2,4,6,\dots$ (we omit the odd numbers since we do not need them for our main
theorems).

n                  2    4     6      8       10         12           14              16
p_n = u_n          1    4    41    715   17 722    580 317   24 011 157   1 216 070 380
b_n = u*_{n,1}     2   15   203   4140  115 975  4 213 597  190 899 322  10 480 142 147

We determined $Q(n,t,s,R)$ and $Q^*(K,m,t,s,R)$ for certain small $n, K, m$ on a computer in
the following way. First all partitions in $P(n,t)$ and $U(n,s)$ (resp. $U^*(K,m,s)$) are computed
recursively. For $P(n,t)$ we have the following procedure:

1. if n < 2 or t > n/2 then RETURN P(n,t) = ∅.

2. if t = 1 then RETURN P(n,1) = {{1,...,n}}.

3. P(n,t) = ∅

4. compute (recursively) P(n−1,t) and P(n−2,t−1).

5. for each A ∈ P(n−1,t):

   for j from 1 to t:

      create new partition A' by adding the element n to the j-th subset of A
      add A' to P(n,t)

6. for each A ∈ P(n−2,t−1):

   for ℓ from 1 to n−1:

      create new partition A' from A by incrementing each element z ∈ [n−2] by 1 if z ≥ ℓ
      and adding the subset {ℓ,n}
      add A' to P(n,t)

7. RETURN P(n,t)

We remark that from this procedure also the recursion formula (2.3) follows.
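The procedure above translates almost line by line into code. The following is a minimal sketch; the function name and the representation of a partition as a list of frozensets are choices made for this illustration, and the extra guard `t < 1` is added for safety.

```python
def partitions_min2(n: int, t: int) -> list:
    """All partitions of {1, ..., n} into t blocks, each with at least two
    elements, generated by steps 1-7 of the recursive procedure."""
    # Step 1: no such partition exists.
    if n < 2 or t < 1 or 2 * t > n:
        return []
    # Step 2: a single block.
    if t == 1:
        return [[frozenset(range(1, n + 1))]]
    result = []  # Step 3
    # Step 5: add the element n to one of the t blocks of a partition of [n-1].
    for partition in partitions_min2(n - 1, t):
        for j in range(t):
            new_partition = list(partition)
            new_partition[j] = new_partition[j] | {n}
            result.append(new_partition)
    # Step 6: shift the elements >= l up by one and add the new pair {l, n}.
    for partition in partitions_min2(n - 2, t - 1):
        for l in range(1, n):
            shifted = [frozenset(z + 1 if z >= l else z for z in block) for block in partition]
            shifted.append(frozenset({l, n}))
            result.append(shifted)
    return result  # Step 7
```

Step 5 contributes $t\,|P(n-1,t)|$ partitions and step 6 contributes $(n-1)\,|P(n-2,t-1)|$, which is how the recursion formula for $S_2(n,t)$ can be read off.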

The partitions in $U(n,s)$ are determined by first computing the set $V(n,s)$ of all partitions
of $[n]$ into $s$ blocks and then omitting those that have adjacencies. Similarly $U^*(K,m,s)$ is
computed. Hereby, we have the following recursive procedure to compute $V(n,s)$:

1. if s = 1 RETURN V(n,s) = {{1,...,n}}

2. if s = n RETURN V(n,s) = {{1},...,{n}}

3. V(n,s) = ∅

4. compute (recursively) V(n−1,s) and V(n−1,s−1)

5. for each A ∈ V(n−1,s):

   for j from 1 to s:

      create new partition A' by adding the element n to the j-th subset of A
      add A' to V(n,s)

6. for each A ∈ V(n−1,s−1):

   create new partition A' = A ∪ {{n}}
   add A' to V(n,s)

7. RETURN V(n,s)

One may easily deduce the recursion formula $S(n,k) = k\,S(n-1,k) + S(n-1,k-1)$ for the
Stirling numbers of the second kind $S(n,k) = |V(n,k)|$ from this procedure.
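A sketch of this second procedure, together with the filtering step that produces $U(n,s)$, might look as follows. Adjacency is understood cyclically here ($n$ and $1$ count as neighbours), following the convention $2n+1 = 1$ used in the proof of Lemma 3.5; the function names and the list-of-frozensets representation are illustrative assumptions.

```python
def all_partitions(n: int, s: int) -> list:
    """V(n, s): all partitions of {1, ..., n} into exactly s blocks."""
    if s < 1 or s > n:
        return []
    if s == 1:
        return [[frozenset(range(1, n + 1))]]
    if s == n:
        return [[frozenset({i}) for i in range(1, n + 1)]]
    result = []
    # add n to one of the s blocks of a partition of [n-1] into s blocks
    for partition in all_partitions(n - 1, s):
        for j in range(s):
            new_partition = list(partition)
            new_partition[j] = new_partition[j] | {n}
            result.append(new_partition)
    # or add {n} as a new singleton block
    for partition in all_partitions(n - 1, s - 1):
        result.append(list(partition) + [frozenset({n})])
    return result


def has_adjacency(partition, n: int) -> bool:
    """True if some block contains both j and its cyclic successor."""
    return any((j % n) + 1 in block for block in partition for j in block)


def no_adjacency_partitions(n: int, s: int) -> list:
    """U(n, s): partitions into s blocks without (cyclic) adjacencies."""
    return [p for p in all_partitions(n, s) if not has_adjacency(p, n)]
```

For $n = 6$ this yields $|U(6,s)| = 1, 10, 20, 9, 1$ for $s = 2,\dots,6$, which sums to $u_6 = p_6 = 41$.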
After determining $P(n,t)$ and $U(n,s)$, for each pair $(\mathcal{A},\mathcal{B})$ with $\mathcal{A}\in P(n,t)$ and $\mathcal{B}\in U(n,s)$
(or $\mathcal{B}\in U^*(K,m,s)$ resp.) we set up the matrix $M(\mathcal{A},\mathcal{B})$ (or $L(\mathcal{A},\mathcal{B})$), see (2.5) and (2.7),
and compute its rank. By counting the number of matrices $M(\mathcal{A},\mathcal{B})$ that have rank $R$ we
determine $Q(n,t,s,R)$ or $Q^*(K,m,t,s,R)$, respectively. The results of these computations for
certain $n, K, m$ are given in the appendix. Considering the table of the numbers $p_n$ (recall that
$p_n^2$ equals the overall number of matrices whose rank has to be determined) we realize that
this procedure is practicable only for small values of $n$. Even for $n = 10$ the computing time
reaches several days and for $n = 14$ it seems impossible to do the task in a reasonable time as
$p_{14}^2 = 576535660478649$.
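For very small $n$ the whole computation fits into a short script. The sketch below enumerates all set partitions directly (rather than recursively by block count), uses the entries $M_{ij} = |A_i\cap B_j| - |(A_i+1)\cap B_j|$ with $A_i+1$ taken cyclically, as written out in the proof of Lemma 4.1, and computes ranks by exact Gaussian elimination. All function names are hypothetical, and this is not the author's original program.

```python
from collections import Counter
from fractions import Fraction


def set_partitions(n):
    """All partitions of {1, ..., n} as lists of frozensets."""
    parts = [[]]
    for i in range(1, n + 1):
        extended = []
        for p in parts:
            for j in range(len(p)):
                extended.append(p[:j] + [p[j] | {i}] + p[j + 1:])
            extended.append(p + [frozenset({i})])
        parts = extended
    return parts


def in_P(partition):
    """All blocks have at least two elements."""
    return all(len(block) >= 2 for block in partition)


def in_U(partition, n):
    """No block contains an element together with its cyclic successor."""
    return not any((j % n) + 1 in block for block in partition for j in block)


def matrix_M(A, B, n):
    """M_ij = |A_i & B_j| - |(A_i + 1) & B_j|, with A_i + 1 taken cyclically."""
    shifted = [frozenset((r % n) + 1 for r in block) for block in A]
    return [[len(a & b) - len(sa & b) for b in B] for a, sa in zip(A, shifted)]


def rank(matrix):
    """Rank over the rationals via Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk


def rank_counts_for(A, n):
    """Counter over (s, R) for a fixed partition A and all B in U(n, s)."""
    U = [B for B in set_partitions(n) if in_U(B, n)]
    return Counter((len(B), rank(matrix_M(A, B, n))) for B in U)


def Q_table(n):
    """Counter with Q(n, t, s, R) stored under the key (t, s, R)."""
    parts = set_partitions(n)
    P = [A for A in parts if in_P(A)]
    U = [B for B in parts if in_U(B, n)]
    return Counter((len(A), len(B), rank(matrix_M(A, B, n))) for A in P for B in U)
```

With $n = 6$ there are only $41^2 = 1681$ matrices; for the partition $\{\{1,2,3,5\},\{4,6\}\}$ of Example 1 in Section 3.6 all of them have rank 1, with $1, 10, 20, 9, 1$ partitions $\mathcal{B}$ for $s = 2,\dots,6$.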

The following lemma is concerned with $Q(n,t,s,0)$ for some special cases.

Lemma 4.1. (a) $Q(n,1,s,0) = |U(n,s)|$.

(b) It holds $Q(2n,2,2,0) = \frac{(2n)!}{2(n!)^2} - 1$ and $Q(2n,2,2,1) = 2^{2n-1} - 2n - \frac{(2n)!}{2(n!)^2}$.

(c) If $n \ge 2$ and $2s > 3n$ then $Q(2n,n,s,0) = 0$.

(d) If $t \ne 1$ then $Q(2n,t,2n,0) = 0$.

(e) If $n \ge 3$ and $3t > 2n$ then $Q(2n,t,2n-1,0) = 0$.

Proof: (a) There is only one partition in P{n, 1) and the maximal rank of M(A,B ) is 
min{t, s} — 1 = 0. Thus, Q{n , 1, s, 0) = | U(n, s)|. 

(b) Clearly, U(2n,2) consists of only 1 partition B = (Bi,B 2 ), i.e., 

B\ = {1,3,5,...,2n — l}, B 2 = {2,4,6,...,2 n}. 

The associated matrix M = M(A, B), A = {A \, A 2 } € P(2n, 2) has entries 


Mij = \Ai nBj\- I (Ai + i)nBj\ = \Ai nBj\- \Ai n (Bj - 1)1 = \Ai n b 3 \ - \A { n b 3 _j 


since B\ — 1 = B 2 and B 2 — 1 = B\. Thus, M(A, B) has rank 0, i.e., M(A, B) = 0 if and only 
if 

l/^n/lii A\TB 2 \ and \A 2 fl Bf = \A 2 F\ B 2 \. (4.4) 

So A\ and A 2 must have the same number of elements from B± and from B 2 . So we can 
construct all possible partitions A satisfying (IP1) in the following way. Choose m € {1,..., ri¬ 
ll and then form A\ by taking m elements from B\ and m elements from B 2 . The set A 2 is 

( Tl 

different ways. However, if we run with m through {l,...,n — 1} every possible partition 
appears once as {A\,A 2 } and once as {A 2 ,A\}, so that altogether we have the formula 


Q(2n, 2, 2, 0) 



(2n)! 

2^!)2 


The second equality follows from the fact that Ylm=o 
second assertion follows easily since 




see e.g. m • Now the 


Q(2n,2,2,l) 


|P(2n, 2)| - Q(2n, 2, 2,0) = 2 2n " 1 - 1 - 2n - Q(2n, 2, 2,0). 
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(c)-(e) For all the remaining cases we have to prove that for all relevant partitions A £ 
P{2n, t),B £ U(2n , s ) we never have M(A , B) = 0 (the zero-matrix). Observe that M(A, B) = 
0 means that 

\A n B\ = \(A + 1) n B\ for all A £ A, B £ B (4.5) 

(where A +1 is computed modulo n as usual). So for all three cases we assume that A £ P(2n, t) 
and B £ U{2n,s ) are given (with t,s satisfying the respective conditions) and show that the 
condition (El leads to a contradiction. 

(c) Clearly, a partition $\mathcal{A}$ in $P(2n,n)$ has only subsets consisting of precisely 2 elements.
The condition $2s > 3n$ implies that a partition $\mathcal{B} \in U(2n,s)$ has at least $n$ singletons (i.e.,
subsets consisting of only one element). Indeed, if there were fewer than $n$ singletons, then
the overall number of elements would be at least $n - 1 + 2(s - (n-1))$ (i.e., $n-1$ sets
with 1 element and $s - (n-1)$ sets with at least 2 elements). Since $n - 1 + 2(s - (n-1)) =
2s - n + 1 > 3n - n + 1 = 2n + 1$, this produces a contradiction as there are only $2n$ elements.

Now, if $\{k\}$ is a singleton of $\mathcal{B}$ and $k \in A$ for $A \in \mathcal{A}$, then condition (4.5) implies that also
$k - 1 \in A$. As all subsets $A$ in $\mathcal{A}$ have precisely two elements this means that $A = \{k-1,k\}$.
Using once more (4.5) we further see that this implies that neither $\{k-1\}$ nor $\{k+1\}$ can be
singletons in $\mathcal{B}$. So $\mathcal{A}$ has the form

$$\{\{1,2\},\{3,4\},\dots,\{2n-1,2n\}\}$$

and the singletons of $\mathcal{B}$ are $\{2\},\{4\},\dots,\{2n\}$, up to shifting all elements by 1 (modulo $2n$).
We still have to distribute the remaining numbers $1,3,5,\dots,2n-1$ onto subsets in $\mathcal{B}$. If
$1 \in B \in \mathcal{B}$ then condition (4.5) with $A = \{1,2\}$ tells us that also $3 \in B$. The same argument
for 3 and $A = \{3,4\}$ implies that also $5 \in B$ and so on. So $B = \{1,3,5,\dots,2n-1\}$ and thus
$s = n+1$. Since $n > 2$ this is a contradiction to $s > 3n/2$. Thus, there is no pair of partitions
$\mathcal{A} \in P(2n,n)$, $\mathcal{B} \in U(2n,s)$ with $M(\mathcal{A},\mathcal{B}) = 0$.

(d) The only partition in $U(2n,2n)$ is $\mathcal{B} = \{\{1\},\{2\},\dots,\{2n\}\}$. Thus condition (4.5)
implies that whenever $j \in A \in \mathcal{A}$ then also $j - 1 \in A$. As $j$ is arbitrary this means that the
only possibility for $A$ is $\{1,2,\dots,2n\}$, i.e., $t = 1$.

(e) The condition on $t$ implies that there is at least one subset $A_1 \in \mathcal{A} \in P(2n,t)$ that has
precisely 2 elements. Moreover, any partition in $U(2n,2n-1)$ has precisely $2n-2$ singletons
and one subset $B_1$ consisting of precisely 2 elements. We write $A_1 = \{j_1,j_2\}$, $j_2 > j_1$, and
$B_1 = \{k_1,k_2\}$.

We distinguish two cases. Let us first assume $j_2 \neq j_1 + 1$ and $\{j_1,j_2\} \neq \{1,2n\}$. All
singletons in $\mathcal{B}$ are given by $\{k\}$ with $k \neq k_1,k_2$. Checking condition (4.5) with $A_1$ and $\{k\}$
shows that necessarily $j_2 \neq k$ for all $k \neq k_1,k_2$. Without loss of generality this means $j_2 = k_2$.
Condition (4.5) with $A_1$ and $B_1$ thus yields

$$|\{j_1,j_2\} \cap \{k_1,j_2\}| = |\{j_1+1,j_2+1\} \cap \{k_1,j_2\}|.$$

It is not possible that the sets on both sides both have cardinality 2. Thus, the relation implies
$j_1 \neq k_1$. Moreover, since by assumption $j_1 + 1 \neq j_2$, either $k_1 = j_1 + 1$ or $k_1 = j_2 + 1$. In both cases
the singleton $\{j_1\}$ belongs to $\mathcal{B}$. Condition (4.5) yields $|\{j_1,j_2\} \cap \{j_1\}| = |\{j_1+1,j_2+1\} \cap \{j_1\}|$,
which is not possible since $j_1 \neq j_2 + 1$ by the assumptions $j_2 > j_1$ and $\{j_1,j_2\} \neq \{1,2n\}$.

Next we treat the case $A_1 = \{j_1,j_2\} = \{j,j+1\}$. Without loss of generality we may
assume $j = 1$, so $A_1 = \{1,2\}$. Checking condition (4.5) with $A_1$ and a singleton $\{k\}$ of $\mathcal{B}$ shows
that $k \neq 1,3$. Thus $B_1 = \{1,3\}$ and the singletons of $\mathcal{B}$ are the sets $\{2\},\{4\},\{5\},\{6\},\dots,\{2n\}$.
Then condition (4.5) with $A_1$ and $B_1$ is satisfied. Now, let $A$ be the subset of $\mathcal{A}$ containing the
element 3 and write $A = \{3\} \cup A'$. Then condition (4.5) with $B = \{4\}$ reads
$|(A' \cup \{3\}) \cap \{4\}| = |((A'+1) \cup \{4\}) \cap \{4\}| = 1$. Thus, $A'$ and hence also $A$ must contain
the element 4. We may continue in this way to show that $A = \{3,4,5,\dots,2n\}$. In particular, $t = 2$.
Since $n > 3$ this is a contradiction to $3t > 2n$. ■

Figure 2: Bounds for the probability of failure of exact reconstruction due to Theorem 2.1
with $M = 10$, $D = 10000$.

One may compare the assertions of this lemma with the tables in the appendix. For 
Q*(K,m,t,s,0) certainly a similar analysis can be done but we have not further pursued this 
issue here. 
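
Parts (b) and (d) of the lemma are small enough to confirm by exhaustive enumeration. The following Python sketch is not part of the original computations, and the helper names (`shift`, `rank_zero`, `count_part_b`, `set_partitions`, `solutions_part_d`) are hypothetical. It assumes condition (4.5) with the shift taken cyclically on {1, ..., 2n}. For part (d) it enumerates all set partitions rather than only those in P(2n, t), which is a slightly stronger check.

```python
from itertools import combinations
from math import comb


def shift(block, size):
    """Return block + 1, computed cyclically on {1, ..., size}."""
    return {k % size + 1 for k in block}


def rank_zero(partition_a, partition_b, size):
    """Condition (4.5): |A n B| = |(A + 1) n B| for all blocks A and B."""
    return all(
        len(a & b) == len(shift(a, size) & b)
        for a in partition_a
        for b in partition_b
    )


def count_part_b(n):
    """Brute-force Q(2n, 2, 2, 0): two blocks of size >= 2 against B = (odds, evens)."""
    size = 2 * n
    ground = set(range(1, size + 1))
    odds_evens = [{k for k in ground if k % 2 == 1}, {k for k in ground if k % 2 == 0}]
    count = 0
    for block_size in range(2, size - 1):  # |A_1| = 2, ..., 2n - 2
        # Fixing the element 1 in A_1 lists every unordered partition exactly once.
        for rest in combinations(range(2, size + 1), block_size - 1):
            a1 = {1, *rest}
            if rank_zero([a1, ground - a1], odds_evens, size):
                count += 1
    return count


def set_partitions(elements):
    """Yield all partitions of the list `elements` as lists of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        # Put `first` into one of the existing blocks ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + partition


def solutions_part_d(n):
    """All partitions A of {1..2n} with M(A, B) = 0 for B the partition into singletons."""
    size = 2 * n
    singletons = [{k} for k in range(1, size + 1)]
    return [
        p for p in set_partitions(list(range(1, size + 1)))
        if rank_zero([set(block) for block in p], singletons, size)
    ]


for n in (2, 3, 4):
    print(n, count_part_b(n), comb(2 * n, n) // 2 - 1)
print(solutions_part_d(3))
```

For n = 2, 3, 4 the enumeration gives 2, 9 and 34, in agreement with the closed form and with the entries Q(4,2,2,0), Q(6,2,2,0) and Q(8,2,2,0) of the appendix tables. For part (d) only the one-block partition survives, i.e., t = 1.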

5 Bounds for the probability of exact reconstruction 

In this section we illustrate the bounds in Theorems 2.1 and 2.3 for the probability of exact
reconstruction by drawing some plots. In each case we plot the bound on the probability
of failure of exact reconstruction, i.e., 1 minus the expressions given in the two theorems.

In Figure 2 we have chosen $M = 10$, $D = 10000$ and $n = 3,4,5,6,7$ to show a logarithmic
plot of the probability bound of Theorem 2.1 versus the number of samples. The
parameter $\beta$ was always chosen close to $1/2$, and then $\kappa$ was determined such that equality
holds in the corresponding condition of the theorem. One can clearly see that here $n = 5$ or
$n = 6$ is the optimal choice, depending on the precise value of the number of samples $N$.
Unfortunately, it seems that these bounds are quite pessimistic when compared to the numerical
experiments (see next section). In the given example one needs at least about $N = 2000$ samples
(corresponding to a “non-linear oversampling factor” of $N/M = 200$) in order that the bound
becomes non-trivial.

Figure 3: Probability of failure of exact reconstruction for $\mathbb{E}|T| = 4$, $D = 5000$ (left) and
$\mathbb{E}|T| = 8$, $D = 20000$ (right) due to Theorem 2.3.

Based on the computation of the explicit values of the numbers $Q$ and $Q^*$ we can also
illustrate the probability bound (2.15) in Theorem 2.3. Unfortunately, we may only take
$n \le 4$ since for higher values of $n$ the corresponding numbers $Q$ and $Q^*$ are not at our
disposal. Figure 3 shows a plot of the bound (2.15). We have chosen $n = 2,3,4$ and
$(\mathbb{E}|T|, D) = (4,5000), (8,20000)$ and varied the number $N$ of sampling points. For $n = 2$ we
have chosen $K_1 = 2$, $K_2 = 1$; for $n = 3$: $K_1 = 3$, $K_2 = 2$, $K_3 = 1$; and for $n = 4$ we took
$K_1 = 4$, $K_2 = 2$, $K_3 = 1$, $K_4 = 1$ as suggested in part (a) of the corresponding remark. It turned
out that good choices for $\beta$ are around $1/2$ and for $\kappa$ around $10^{-3}$ (with slight variations for the
different choices of the other parameters). The remaining parameter $\alpha$ was chosen such that
equality holds in the corresponding condition of the theorem.

Looking at the plot one clearly sees that the bound becomes better for larger $n$. However,
as above, the bounds are still quite pessimistic. Nevertheless, as already remarked, one
expects them to be at least better than the ones of Theorem 2.1. Figure 4 supports this
intuition. Indeed, we plotted the different bounds for $M = \mathbb{E}|T| = 4$, $D = 5000$, $n = 4$ and
$K_1 = 4$, $K_2 = 2$, $K_3 = 1$ and $K_4 = 1$. Evidently the curve for the bound of Theorem 2.3 is far
below the one of Theorem 2.1. Unfortunately, we cannot yet use the full strength of Theorem
2.3, as we are still lacking an efficient way to actually compute the bound explicitly for higher
values of $n$. In fact, up to now Theorem 2.1 still gives the better bound in most situations
because we are able to evaluate its bound for arbitrary $n$.

Let us finally discuss possible reasons why the theoretical bounds are quite pessimistic.
Both theorems give bounds for the probability that exact reconstruction holds for all choices
of the coefficients of $f$ on $T$, while the numerical experiments in the next section also choose
the coefficients on $T$ at random. (Of course, it is impossible to check all possible coefficients by
some algorithm.) Intuitively, it is very plausible that in such an experiment the probability
of failure of exact reconstruction is much lower than in the situation of our main theorems.
We remark that it seems to be an interesting project to investigate theoretically also the
case that the coefficients of $f$ on $T$ are chosen at random, see also Section 5 in [7]. We plan to
pursue this issue in a follow-up paper.

Figure 4: Comparison of the bounds of Theorem 2.1 and Theorem 2.4 for $M = \mathbb{E}|T| = 4$,
$D = 5000$ and $n = 4$.

Of course, the theoretical bounds may also be pessimistic compared to reality since some 
of the estimates in the proof are perhaps not sharp. However, it seems to be hard to improve 
on the method of our proof. 

6 Numerical experiments 

Let us describe some numerical tests of the proposed sampling and reconstruction method.
In order to use convex optimization techniques we reformulate the optimization problem (2.1)
as the following equivalent problem,

$$\min \sum_k u_k \quad\text{subject to}\quad \sqrt{(c_k^{(1)})^2 + (c_k^{(2)})^2} \le u_k, \qquad \sum_k \big(c_k^{(1)} + i c_k^{(2)}\big) e^{i k \cdot x_j} = f(x_j), \quad j = 1,\dots,N, \tag{6.1}$$

with $u_k$, $c_k^{(1)}$ and $c_k^{(2)}$, $k \in [-q,q]^d$, as real optimization variables. The solution to the
original problem (2.1) is then given as $c_k = c_k^{(1)} + i c_k^{(2)}$.

A problem of the above type (6.1) is known as a second order cone program [2]. Efficient
algorithms to solve such problems exist. We have used the toolbox MOSEK (in connection
with MATLAB), which provides an interior point solver for cone problems. We remark that
if the coefficients $c_k$ are real-valued then the minimization problem can be recast as a
linear program.

Numerical results: number of failures out of 100 trials for $M = |T| = 8$ and $q = 40$ versus
number of samples.
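
The remark on real-valued coefficients can be made concrete as follows. This is a standard reformulation written out under the assumption that all $c_k$ are real, not a formula taken from the text: the modulus constraint becomes a pair of linear inequalities, and each complex sampling condition splits into its real and imaginary parts.

```latex
\begin{aligned}
\min_{c,\,u} \quad & \sum_{k} u_k \\
\text{subject to} \quad & -u_k \le c_k \le u_k \quad \text{for all } k \in [-q,q]^d, \\
& \sum_{k} c_k \cos(k \cdot x_j) = \operatorname{Re} f(x_j), \quad j = 1,\dots,N, \\
& \sum_{k} c_k \sin(k \cdot x_j) = \operatorname{Im} f(x_j), \quad j = 1,\dots,N.
\end{aligned}
```

At an optimum $u_k = |c_k|$, so the objective equals the $\ell^1$-norm of the coefficient vector.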


Our numerical experiment has the following form. We first choose the sparsity $M$, the
maximal degree $q$ (we only tested for $d = 1$) and the number of samples $N$. Then the following
steps are done:

1. Choose a random subset $T \subset [-q,q]$ of size $M$ from the uniform distribution. (Generate
a random permutation of $[-q,q]$ and take the first $M$ elements.)

2. Randomly generate the coefficients $c_k$ for $k \in T$ by choosing their real part and imaginary
part from a standard normal distribution.

3. Randomly select $x_1,\dots,x_N$ independently from the uniform distribution on $[0,2\pi]$.

4. Generate $f(x_j) = \sum_{k \in T} c_k e^{i k x_j}$, $j = 1,\dots,N$.

5. Solve the minimization problem (6.1).

6. Compare the result to the original vector of coefficients.
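
The data-generating steps 1 to 4 and the comparison in step 6 can be sketched in plain Python as follows. This is an illustration only, not the MATLAB code used for the experiments. The function names `make_experiment` and `is_exact` and the tolerance `tol` are assumptions introduced here, since the text does not state a numerical threshold for "exact" reconstruction. Step 5 is deliberately left to an external cone solver.

```python
import cmath
import math
import random


def make_experiment(q, M, N, rng):
    """Steps 1-4: random support T, coefficients c_k, sample points x_j, samples f(x_j)."""
    frequencies = list(range(-q, q + 1))
    rng.shuffle(frequencies)              # step 1: random permutation of [-q, q] ...
    support = sorted(frequencies[:M])     # ... of which the first M elements form T
    # Step 2: real and imaginary parts drawn from a standard normal distribution.
    coeffs = {k: complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for k in support}
    # Step 3: independent uniform sample points on [0, 2*pi].
    points = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(N)]
    # Step 4: evaluate the sparse trigonometric polynomial at the sample points.
    samples = [
        sum(c * cmath.exp(1j * k * x) for k, c in coeffs.items())
        for x in points
    ]
    return coeffs, points, samples


def is_exact(recovered, original, tol=1e-6):
    """Step 6: compare recovered and original coefficients (tol is an assumed threshold)."""
    keys = set(recovered) | set(original)
    return all(abs(recovered.get(k, 0.0) - original.get(k, 0.0)) <= tol for k in keys)


rng = random.Random(0)  # fixed seed so that the sketch is reproducible
coeffs, points, samples = make_experiment(q=40, M=8, N=30, rng=rng)
print(len(coeffs), len(points), len(samples))
```

A full trial would pass `points` and `samples` to a solver for (6.1) and feed its output to `is_exact`. Repeating this 100 times for each N gives the failure counts described in the text.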

For the figure showing the numerical results we have chosen $q = 40$, i.e., $D = 2q + 1 = 81$, and
$M = |T| = 8$. Then for each $N$ between 1 and 40 we ran the above procedure 100 times and
counted how often exact reconstruction failed. The result is illustrated in the plot. As one can
see, for $N$ larger than 30 (corresponding to a non-linear oversampling factor of about 4) our
reconstruction method always succeeded in giving back the original function exactly!

Comparing these results with the bounds of Theorem 2.3 as illustrated in the previous
section one realizes that in practice the method works much better than we are able
to predict theoretically. So this method seems to have considerable potential for practical
applications of signal reconstruction.

A Appendix

A.1 Tables for Q(n,t,s,R) and Q*(K,m,t,s,R)

In the following tables we list some values for the numbers $Q(n,t,s,R)$ and $Q^*(K,m,t,s,R)$
that were computed by the procedures described in Section 4. For the probability estimation
(2.15) the numbers $Q^*(2K_m,m,t,s,R)$, $m = 1,\dots,n$, are needed and we have chosen $n = 4$
and $K_1 = 4$, $K_2 = 2$, $K_3 = 1$, $K_4 = 1$ (since for higher values of $n$ and $K_m$ computing
times are absurdly long). So the numbers $Q^*(K,m,t,s,R)$ have to be computed for $(K,m) =
(8,1), (4,2), (2,3), (2,4)$.

Note that $Q(n,t,s,R) = 0$ if $R > \min\{t,s\}$ or $s = 1$ or $t > n/2$, and $Q^*(K,m,t,s,R) = 0$
if $R > \min\{t,s\}$, which is the reason why we do not reproduce these cases. Also recall that
$U(n,1) = \emptyset$ and, unless $m = 1$, $U(K,m,1) = \emptyset$; hence $Q(n,t,1,R) = 0$ and $Q^*(K,m,t,1,R) = 0$.

         Q(4,1,s,R)    Q(4,2,s,R)
         R = 0         R = 0     R = 1
s = 2    1             2         1
s = 3    2             2         4
s = 4    1             0         3

         Q(6,1,s,R)    Q(6,2,s,R)              Q(6,3,s,R)
         R = 0         R = 0     R = 1         R = 0     R = 1     R = 2
s = 2    1             9         16            6         9         0
s = 3    10            46        204           15        78        57
s = 4    20            45        455           5         87        208
s = 5    9             9         216           0         18        117
s = 6    1             0         25            0         0         15

         Q(8,1,s,R)    Q(8,2,s,R)              Q(8,3,s,R)
         R = 0         R = 0     R = 1         R = 0     R = 1     R = 2
s = 2    1             34        85            72        418       0
s = 3    42            674       4324          732       8952      10896
s = 4    231           1970      25519         1218      27446     84526
s = 5    294           1386      33600         504       19944     123612
s = 6    126           308       14686         56        4556      57128
s = 7    20            20        2360          0         304       9496
s = 8    1             0         119           0         0         490

         Q(8,4,s,R)
         R = 0     R = 1     R = 2     R = 3
s = 2    24        81        0         0
s = 3    112       1208      3090      0
s = 4    84        2018      11944     10209
s = 5    14        800       9368      20688
s = 6    0         86        2236      10908
s = 7    0         0         156       1944
s = 8    0         0         0         105

Q*(8,1,t,s,R):

         t = 1                   t = 2
         R = 0     R = 1         R = 0     R = 1     R = 2
s = 1    1         0             34        85        0
s = 2    34        93            610       6598      7905
s = 3    72        894           792       20420     93742
s = 4    24        1677          168       13736     188515
s = 5    0         1050          0         3380      121570
s = 6    0         266           0         408       31246
s = 7    0         28            0         16        3316
s = 8    0         1             0         0         119

         t = 3                                       t = 4
         R = 0     R = 1     R = 2     R = 3         R = 0     R = 1     R = 2     R = 3     R = 4
s = 1    72        418       0         0             24        81        0         0         0
s = 2    792       14572     46866     0             168       3744      9423      0         0
s = 3    792       25704     210638    236206        144       4440      38472     58374     0
s = 4    144       10104     192512    630730        24        1296      21060     96384     59841
s = 5    0         1368      63134     449998        0         96        3816      37302     69036
s = 6    0         72        8700      121568        0         0         216       5292      22422
s = 7    0         0         400       13320         0         0         0         240       2700
s = 8    0         0         0         490           0         0         0         0         105
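
The tabulated values can be cross-checked without recomputing them. The sketch below assumes, in line with the way part (b) of the lemma is derived, that $Q(n,t,s,R)$ counts the pairs of partitions in $P(n,t) \times U(n,s)$ whose matrix has rank $R$, so that each row of a table should sum to $|P(n,t)| \cdot |U(n,s)|$. The dictionary `Q6` is transcribed from the Q(6,t,s,R) table above, and the function name `row_sums_consistent` is hypothetical.

```python
# Q6[t][s] lists the tabulated values Q(6, t, s, R) for R = 0, 1, ...
# (transcribed from the table, not recomputed).
Q6 = {
    1: {2: [1], 3: [10], 4: [20], 5: [9], 6: [1]},
    2: {2: [9, 16], 3: [46, 204], 4: [45, 455], 5: [9, 216], 6: [0, 25]},
    3: {2: [6, 9, 0], 3: [15, 78, 57], 4: [5, 87, 208], 5: [0, 18, 117], 6: [0, 0, 15]},
}


def row_sums_consistent(table):
    """Check that every row sums to |P(n, t)| * |U(n, s)| (assumed counting identity)."""
    # |U(n, s)| = Q(n, 1, s, 0) by part (a) of the lemma.
    u_sizes = {s: values[0] for s, values in table[1].items()}
    for t, rows in table.items():
        # U(6, 2) contains a single partition, so the s = 2 row sums to |P(6, t)|.
        p_size = sum(rows[2])
        for s, values in rows.items():
            if sum(values) != p_size * u_sizes[s]:
                return False
    return True


print(row_sums_consistent(Q6))
```

The same check can be applied to the other tables by transcribing their rows in the same format.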


Q*(4,2,t,s,R):

         t = 1                   t = 2
         R = 0     R = 1         R = 0     R = 1     R = 2
s = 2    3         5             47        415       490
s = 3    37        171           274       5866      18612
s = 4    56        596           226       9537      67825
s = 5    21        555           46        3946      64552
s = 6    2         186           2         480       21890
s = 7    0         24            0         12        2844
s = 8    0         1             0         0         119

         t = 3                                       t = 4
         R = 0     R = 1     R = 2     R = 3         R = 0     R = 1     R = 2     R = 3     R = 4
s = 2    50        744       3126      0             8         108       724       0         0
s = 3    134       4930      42410     54446         10        412       4910      16508     0
s = 4    54        4070      70998     244358        2         186       4377      30778     33117
s = 5    4         776       31452     250008        0         17        962       14607     44894
s = 6    0         26        4422      87672         0         0         53        2266      17421
s = 7    0         0         166       11594         0         0         0         102       2418
s = 8    0         0         0         490           0         0         0         0         105


Q*(2,3,t,s,R):

         t = 1                   t = 2                             t = 3
         R = 0     R = 1         R = 0     R = 1     R = 2         R = 0     R = 1     R = 2     R = 3
s = 2    1         1             3         27        20            1         9         20        0
s = 3    7         15            10        200       340           1         23        166       140
s = 4    6         25            4         172       599           0         7         132       326
s = 5    1         10            0         29        246           0         0         21        144
s = 6    0         1             0         0         25            0         0         0         15


Q*(2,4,t,s,R):

         t = 1                   t = 2
         R = 0     R = 1         R = 0     R = 1     R = 2
s = 2    1         1             11        135       92
s = 3    31        63            157       4225      6804
s = 4    90        301           222       11981     34326
s = 5    65        350           69        8438      40878
s = 6    15        140           5         1902      16538
s = 7    1         21            0         124       2494
s = 8    0         1             0         0         119

         t = 3                                       t = 4
         R = 0     R = 1     R = 2     R = 3         R = 0     R = 1     R = 2     R = 3     R = 4
s = 2    11        223       746       0             2         36        172       0         0
s = 3    71        2974      23519     19496         4         164       2393      7309      0
s = 4    48        3907      63319     124316        1         101       2865      20438     17650
s = 5    5         1171      42404     159770        0         11        820       12988     29756
s = 6    0         85        9135      66730         0         0         58        2643      13574
s = 7    0         0         572       10208         0         0         0         156       2154
s = 8    0         0         0         490           0         0         0         0         105